Preface


Mathematical analysis is a huge field that lies at the core of contemporary sciences.
It provides theories and algorithms that enable specialists to solve a number of
problems of utmost importance in domains that touch our everyday life:
optimal allocation of resources, market equilibria, signal processing, mass transportation,
weather forecasting, celestial mechanics, and so on. The abstract character
of mathematics amplifies this: once a problem is solved, an entire family of
related problems is also solved. So the implications run broad and deep.

Understanding analysis is thus crucial in the twenty-first century! Large groups
of students are now interested in preparing themselves for careers in engineering,
banking, medicine, law, and numerous other fields where analytical skills are highly
desirable. However, as many people have already noticed, teaching analysis
to such a diverse audience, less familiar with the nature of axiomatic arguments, is
quite a challenging task.

The role of the first course in analysis, which follows elementary calculus, is thus
critical. It should provide a mathematically rigorous approach to the study of
functions of one real variable, develop mathematical intuition, make clear
the importance and limits of computer facilities, and much more.

The book is intended to familiarize the reader with the basic concepts, principles,
and methods of analysis and to ease access to more advanced topics. It
focuses mainly on the case of one real variable (with additions concerning
continuity in metric spaces, complex power series, and some elements of functional
analysis), because this material offers the reader the necessary background for any
further serious studies and also a glimpse into the aims, scope, and evolution of
mathematics.

Contrary to popular belief, the mathematical analysis of one real variable is still 
an active domain, continuing to surprise us with unexpected new results. Also, 
many interesting problems are waiting for an answer, some of which are mentioned 
in our text. Meanwhile, this field has become a valuable source of inspiration for 
much of contemporary research in mathematics. Indeed, many new results and 
methods of higher mathematics originate in the one real variable mathematical 
analysis, revealing a new understanding of some old classical results. 




The writing of the present book started in 2005, when the second author 
prepared a first version for his graduate students at the University of Craiova. 
A year later, a collaboration of the two authors began at Abdus Salam School of 
Mathematical Sciences in Lahore, and the whole material was rearranged and 
expanded to make the text more versatile. 

Indeed, the book contains plenty of material both for seminars and independent 
study, and the instructor can choose from a large variety of options. Every section 
ends with exercises (of medium and high level) and every chapter ends with a
section of notes and remarks that provides historical information and supplementary
material devoted to a better understanding of the present state of the art. For more
flexibility, the three appendices at the end of the book add material on one-dimensional
dynamical systems, applications of Lebesgue’s differentiation theorem,
and the Stieltjes integral.

The core material, Chaps. 1-10, accompanied by Appendices A and C, is
suitable for a year-long course for senior undergraduate students, while
Chaps. 11 and 12, accompanied by Appendix B, cover a one-semester graduate
course on the Lebesgue integral and its applications. The background assumed for
using this text consists of solid courses in both calculus and linear algebra.

The book uses well-established notation such as N, Z, R, C (for the sets of
natural, integer, real, and complex numbers, respectively), N* (for the set of all
positive natural numbers), and R₊ (for the set of all nonnegative real numbers).

The sign □ designates the end of a proof.

For the convenience of the reader, a List of Symbols and an Index of terminology
are supplied at the end of the book.

An accompanying problem book is in preparation and a dedicated web page will
be available at http://math.ucv.ro/~niculescu/Real_Analysis_On_Intervals.html to
keep our readers updated.

Last but not least we would like to thank our many friends and colleagues who 
gave us suggestions, advice, and support. In particular, we wish to thank Nicolae 
Cindea, Flavia-Corina Mitroi-Symeonidis, Waleed Noor, Gabriel Prajitura, Ionel 
Roventa, and Andrei Vernescu for their help. Special thanks are due to Liliana 
Niculescu, Satish Shirali, and Eleutherius Symeonidis, who read a good deal of the 
final version of our manuscript and made a number of helpful suggestions for 
corrections and improvements. 


Craiova, July 2014 A.D.R. Choudary 
Constantin P. Niculescu 
Chapter 1
The Real Numbers 


Real analysis of one variable is based on a description of the main structures (algebraic,
ordered and metric) of R, as well as on the connections between them.
Inevitably, this touches set theory, the foundational theory for the entirety of mathematics.
Some basic facts and terminology are recalled in Sect. 1.1.

The main object of interest in this chapter is the set of real numbers. 

Everyone is familiar, in some sense, with real numbers. Traditionally they are 
understood as infinite decimal expansions such as 


0 = 0.0000000000...
1/2 = 0.5000000000...
1/9 = 0.1111111111...
√2 = 1.4142135623...

and, in general,

a = N.d₁d₂d₃d₄d₅d₆d₇d₈d₉d₁₀...

where N is an integer and d₁, d₂, d₃, ... ∈ {0, ..., 9}.

The set R of real numbers is also known as the real line. The reason is that by 
considering a straight line, one can put real numbers in a bijective correspondence 
with the points of the line. This is done via a system of coordinates. 

Rational numbers are distinguished among all real numbers by the fact that their 
decimal expansions are eventually periodic. The extension of field operations from 
rational numbers to all real numbers is quite involved and needs compatibility with 
the order. Considering this as a task of mathematical logic and foundations, we
restrict ourselves here to an abstract description of the set R via a list of axioms consisting
of the field axioms, order axioms, and completeness axiom. This list completely
characterizes R among the ordered fields. See the Notes and Remarks at the end of
this chapter.


1.1 Preliminaries of Set Theory 


We assume that the reader has some knowledge of what is usually called the Naive Set 
Theory, but not necessarily in the axiomatic system ZFC (Zermelo-Fraenkel system 
with Axiom of Choice). That is why we next discuss informally some aspects of set 
theory that are relevant to the study of analysis. The interested reader will find in the 
book by Halmos [1] a systematic treatment of this subject. 

The ZFC system legitimates very basic notions such as the empty set ∅, the set N
(of all natural numbers), the power set of a set, the image of a set under a function, and so
on. It also allows the usual operations with sets: union, intersection, difference, and
Cartesian product.

The ZFC system does not allow the existence of any set that contains itself as an 
element. Consequently, there does not exist a set of all sets. 

Throughout this book, we use x ∈ X to denote that x is an element of the set X
(equivalently, x belongs to X), and x ∉ X to denote that x is not in X (x does not
belong to X). The empty set ∅ is the set with no elements.

We use the words collection and family as synonyms for a set. Usually, a set
is indicated within curly brackets, either by listing its elements (as in the case of
{0, 1, 2, ..., 9}, the set of decimal digits), or by specifying its characteristic property
(the case of the set {n : n ∈ Z and 2 divides n} of even integers). When
listing a set, the elements appear without repetition (but the order in which they are
listed is unimportant).

Given two sets X and Y, we say that X is a subset of Y (or X is included in Y) if
every element of X is also an element of Y (equivalently, if x ∈ X implies x ∈ Y).
We denote that X is a subset of Y by X ⊂ Y or, equivalently, Y ⊃ X.

The power set P(X) of a set X is the set of all subsets of X. 

Two sets X and Y are equal (that is, X = Y) if they have the same elements. More
formally,

X = Y if and only if X ⊂ Y and Y ⊂ X.


Thus, proving that X = Y is equivalent to proving the inclusions X ⊂ Y and Y ⊂ X.
If X ⊂ Y and X ≠ Y, we say that X is a proper subset of Y (or the inclusion
X ⊂ Y is strict).
If X and Y are sets, we define X ∪ Y as the set {x : x ∈ X or x ∈ Y}, and call
X ∪ Y the union of X and Y. More generally, if (X_i)_{i∈I} is a family of sets indexed by
the nonempty set I, we define its union as the set


⋃_{i∈I} X_i = {x : there is i ∈ I such that x ∈ X_i}.


When the elements of I can be listed, one can use alternative notations. For example,
the sets ⋃_{i∈{1,2,...,n}} X_i and ⋃_{i∈N} X_i are also denoted ⋃_{i=1}^{n} X_i and ⋃_{i=0}^{∞} X_i,
respectively.

For given sets X and Y, we define X ∩ Y as the set {x : x ∈ X and x ∈ Y}, and
call X ∩ Y the intersection of X and Y. If (X_i)_{i∈I} is a family of sets indexed by the
nonempty set I, we define its intersection as the set


⋂_{i∈I} X_i = {x : x ∈ X_i for every i ∈ I}.


The notations ⋂_{i=1}^{n} X_i and ⋂_{i=0}^{∞} X_i have the obvious meaning.
Two sets are said to be disjoint if their intersection is the empty set. The sets of an
indexed family (X_i)_{i∈I} of sets are pairwise disjoint if X_i ∩ X_j = ∅ whenever i ≠ j.
The difference of two sets X and Y is defined as the set


X \ Y = {x : x ∈ X and x ∉ Y}.


The Cartesian product of two nonempty sets X and Y is the set X × Y of all
ordered pairs (x, y) such that x ∈ X and y ∈ Y. In an ordered pair, the order in which
the objects appear is significant. Thus (x, y) = (u, v) if and only if x = u and y = v.
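
For finite sets, the operations introduced so far can be tried out directly. The following is a minimal sketch in Python; the sets X and Y and the indexed family are arbitrary illustrative choices of ours, not taken from the text.

```python
from itertools import product

X = {0, 1, 2, 3}
Y = {2, 3, 4}

union = X | Y                    # X ∪ Y
intersection = X & Y             # X ∩ Y
difference = X - Y               # X \ Y
cartesian = set(product(X, Y))   # X × Y, a set of ordered pairs (x, y)

# Union and intersection of an indexed family (X_i), with index set I = {1, 2, 3}.
family = {1: {1, 2, 3}, 2: {2, 3, 4}, 3: {3, 4, 5}}
family_union = set().union(*family.values())
family_intersection = set.intersection(*family.values())
```

Note that the pair (0, 2) belongs to X × Y while (2, 0) does not, which reflects the fact that the order inside an ordered pair matters.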

A relation between two nonempty sets X and Y is any subset ρ of their Cartesian
product X × Y. Usually, the fact that (x, y) ∈ ρ is denoted by


x ρ y.


The domain of ρ is the set dom ρ = {x ∈ X : (x, y) ∈ ρ for some y ∈ Y} and the
range (or image) of ρ is the set rng ρ = {y ∈ Y : (x, y) ∈ ρ for some x ∈ X}. The
inverse of ρ is ρ⁻¹ = {(y, x) : (x, y) ∈ ρ}. A relation ρ is called single-valued if
(x, y) ∈ ρ and (x, z) ∈ ρ imply y = z.

A relation ~ ⊂ X × X is called an equivalence relation if it is


(a) (reflexive) x ~ x for all x ∈ X;
(b) (symmetric) x ~ y implies y ~ x whenever x, y ∈ X;
(c) (transitive) x ~ y and y ~ z imply x ~ z whenever x, y, z ∈ X.


If ~ is an equivalence relation on X and x ∈ X, then the equivalence class of x
is the set x̂ = {y : y ∈ X, y ~ x}. We will denote by X/~ the set of all equivalence
classes. Any equivalence relation on X gives rise to a partition of X, that is, to a
division of X into pairwise disjoint nonempty subsets. Precisely,


X = ⋃_{x̂ ∈ X/~} x̂.


Conversely, any partition X = ⋃_{i∈I} X_i gives rise to an equivalence relation
~ on X by setting x ~ y precisely when x and y are in the same set X_i of the
partition. Thus, the notions of equivalence relation and partition are equivalent.
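
The passage from an equivalence relation to its partition can be illustrated on a finite set. The sketch below uses a hypothetical helper `equivalence_classes` and, as a sample relation, congruence modulo 3 on {0, ..., 9}; both are illustrative choices of ours.

```python
def equivalence_classes(X, related):
    """Partition the finite set X into the classes of the equivalence relation `related`."""
    classes = []
    for x in sorted(X):
        for cls in classes:
            # By symmetry and transitivity, comparing x with one representative suffices.
            if related(x, next(iter(cls))):
                cls.add(x)
                break
        else:
            # x is related to no existing class, so it starts a new one.
            classes.append({x})
    return classes

X = set(range(10))
classes = equivalence_classes(X, lambda m, n: (m - n) % 3 == 0)
```

The resulting classes are pairwise disjoint, nonempty, and their union is X, as the general statement predicts.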

A relation ρ between the nonempty sets X and Y is called a function (from X
to Y) if it is single-valued and dom ρ = X. It is usual to denote a function f from
X to Y by f : X → Y, and the fact that (x, y) ∈ f by y = f(x). Synonyms for
function are map, mapping, application, transformation. If f : X → Y is a function,
we call X the domain of f, Y the codomain and rng f = {f(x) : x ∈ X} the image
(or the range) of f. The image of an arbitrary subset M of X under f is the set
f(M) = {f(x) : x ∈ M}, while the inverse image under f of a subset N of Y is the
set f⁻¹(N) = {x ∈ X : f(x) ∈ N}.
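
As a small illustration of images and inverse images, here is a sketch for a function on a finite domain. The helper names `image` and `inverse_image` and the sample function (squaring on {−2, ..., 2}) are our own assumptions.

```python
def image(f, M):
    """f(M) = {f(x) : x in M}."""
    return {f(x) for x in M}

def inverse_image(f, X, N):
    """f^{-1}(N) = {x in X : f(x) in N}; the domain X has to be passed explicitly."""
    return {x for x in X if f(x) in N}

X = {-2, -1, 0, 1, 2}
square = lambda x: x * x
```

Note that the inverse image is defined for every function and every subset N of the codomain, even when the function has no inverse, as is the case for the squaring map here.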

The graph of a function f : X → Y is the set graph f = {(x, f(x)) : x ∈ X}. When
X and Y are numerical sets, the graphical representation of a function (as a curve on 
a Cartesian plane) offers a deep understanding of its behavior and will be frequently 
used in this book. 

Among the functions from X into itself we should mention here the identity of X, 


id_X : X → X, id_X(x) = x.


The restriction of a function f : X → Y to a nonempty subset A of X is the
function f|_A : A → Y given by f|_A(a) = f(a) for every a ∈ A. The restriction of the
identity of X to a subset A is called the inclusion function.

Given two functions f : X → Y and g : U → V such that rng f ⊂ dom g we can
consider their composition g ∘ f : X → V, given by


(g ∘ f)(x) = g(f(x)) for every x ∈ X.


The condition rng f ⊂ dom g is automatically verified when Y = U. Note that the
composition of functions satisfies the relation


(h ∘ f) ∘ g = h ∘ (f ∘ g)

whenever the compositions involved are defined.


A function f : X → Y is called injective (or one-to-one) if x₁ ≠ x₂ in X implies
f(x₁) ≠ f(x₂), equivalently, if the equality f(x₁) = f(x₂) implies x₁ = x₂. The
function f is called surjective (or onto) if for each y ∈ Y, there exists x ∈ X such
that y = f(x), in other words, if its range is Y.

The functions that are injective and surjective are called bijective. If f : X → Y is
bijective, for every y ∈ Y there exists a unique x ∈ X such that y = f(x). This allows
us to define the inverse of f as the function f⁻¹ : Y → X defined by f⁻¹(y) = x
when y = f(x). The inverse function is also bijective and satisfies the relations


f⁻¹ ∘ f = id_X and f ∘ f⁻¹ = id_Y.
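
For functions between finite sets, these notions can be checked mechanically. In the sketch below a function is stored as a Python dictionary whose keys form the domain; the helper names and the sample function are hypothetical choices of ours.

```python
def is_injective(f):
    """f is a dict representing a function on the finite domain f.keys()."""
    return len(set(f.values())) == len(f)

def is_surjective(f, Y):
    """True when the range of f is the whole codomain Y."""
    return set(f.values()) == set(Y)

def inverse(f, Y):
    """Return the inverse function as a dict; defined only when f : X -> Y is bijective."""
    if not (is_injective(f) and is_surjective(f, Y)):
        raise ValueError("f is not bijective")
    return {y: x for x, y in f.items()}

f = {1: "a", 2: "b", 3: "c"}
Y = {"a", "b", "c"}
f_inv = inverse(f, Y)
```

Composing in either order returns the corresponding identity: `f_inv[f[x]] == x` for every x in the domain and `f[f_inv[y]] == y` for every y in Y.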


If f : X → Y is a surjective function, then there exists a function g : Y → X such
that f ∘ g = id_Y. This means that for every y ∈ Y, one can choose an element x in
the set {u : f(u) = y}. Surprisingly, the possibility of such selections is assured by
an axiom.


Axiom of Choice (E. Zermelo, 1904) For every nonempty family (X_i)_{i∈I} of nonempty
sets, there exist functions x : I → ⋃_{i∈I} X_i such that x(i) ∈ X_i for every i ∈ I.


Such functions x are usually called choice functions. The set ∏_{i∈I} X_i of all choice
functions is called the Cartesian product of the family (X_i)_{i∈I}. When I = {1, 2, ..., n},
the Cartesian product ∏_{i∈I} X_i is usually denoted X₁ × X₂ × ··· × X_n and coincides
with the set of all n-tuples (x₁, x₂, ..., x_n) with x_i ∈ X_i for i = 1, 2, ..., n.
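
For a finite family of finite sets no axiom is needed, and all choice functions can be listed explicitly. The following sketch does this with `itertools.product`; the family below is an arbitrary example of ours, and each choice function is stored as a dictionary i ↦ x(i).

```python
from itertools import product

# A family (X_i) indexed by I = {1, 2, 3}.
family = {1: {"a", "b"}, 2: {0}, 3: {"p", "q"}}
indices = sorted(family)

# Each choice function x : I -> union of the X_i is stored as a dict i -> x(i).
choice_functions = [
    dict(zip(indices, values))
    for values in product(*(family[i] for i in indices))
]
```

The number of choice functions is the product of the cardinalities of the sets X_i (here 2 · 1 · 2 = 4), in agreement with the identification of choice functions with n-tuples.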
The Axiom of Choice says that the Cartesian product of every nonempty family
(X_i)_{i∈I} of nonempty sets is a nonempty set. In particular, when dealing with a single
nonempty set, say X, it asserts the existence of mappings from {1} to X, a fact that
legitimates the phrase “let x ∈ X.” The possibility to choose an element of a nonempty
set does not mean that we can indicate a concrete element in that set. For example, we
know that the set of all prime numbers is infinite, but the largest known prime number
(until January 2014) is 2^57,885,161 − 1, a number with 17,425,170 digits. However,
the phrase “let p be a prime number greater than 2^57,885,161” is perfectly legitimate.

The Axiom of Choice has many important consequences. Some of them are mentioned
in this book, starting with Sect. 1.6.

A presentation of the system ZFC along with many valuable comments on how 
mathematics is built on set theory and mathematical logic, can be found in the paper 
by Marek and Mycielski [2]. 


Exercises 


1. Suppose that X, Y, Z are sets. Prove that:
(a) (X ∪ Y) ∪ Z = X ∪ (Y ∪ Z) and (X ∩ Y) ∩ Z = X ∩ (Y ∩ Z);
(b) X ∩ (Y ∪ Z) = (X ∩ Y) ∪ (X ∩ Z) and X ∪ (Y ∩ Z) = (X ∪ Y) ∩ (X ∪ Z).
2. (De Morgan’s rules). The complementary set of a subset A of the set T is by
definition the set ∁A = T \ A. Suppose that (A_i)_{i∈I} is a family of subsets of T.
Prove that


∁(⋃_{i∈I} A_i) = ⋂_{i∈I} ∁A_i and ∁(⋂_{i∈I} A_i) = ⋃_{i∈I} ∁A_i.

3. Suppose that A, B, C are sets. Prove that

(A × B) ∪ (A × C) = A × (B ∪ C) and (A × B) ∩ (A × C) = A × (B ∩ C).


4. Suppose that f : E → F and g : F → G are two functions. Show that:
(a) if f and g are injective (respectively surjective), so is g ∘ f;
(b) if f and g are bijective, then (g ∘ f)⁻¹ = f⁻¹ ∘ g⁻¹;
(c) if g ∘ f is injective, then f is injective;
(d) if g ∘ f is surjective, then g is surjective.
5. Suppose that f : E → F is a function and A and B are two subsets of E. Prove
that:
(a) f(A ∪ B) = f(A) ∪ f(B);
(b) f(A ∩ B) ⊂ f(A) ∩ f(B), and give an example where this inclusion is strict;
(c) the equality f(A ∩ B) = f(A) ∩ f(B) holds for all subsets A and B of E if and
only if f is injective.
6. If f : E → F is a function and C and D are two subsets of F, show that:
(a) f⁻¹(C ∪ D) = f⁻¹(C) ∪ f⁻¹(D);
(b) f⁻¹(C ∩ D) = f⁻¹(C) ∩ f⁻¹(D).
7. (Congruence modulo an integer p ≥ 2). Consider on Z the relation


m ≡ n (mod p) if and only if p divides m − n.


(a) Show that this is an equivalence relation. How many equivalence classes does
this relation have?

(b) Let Z_p be the set of equivalence classes of this relation. Consider the function
φ : Z → Z_p given by φ(x) = x̂. Show that φ is surjective.

(c) Prove that a function f : Z → Z is periodic, of period p (that is, f(m + p) =
f(m) for all m ∈ Z) if and only if there is a function F : Z_p → Z such that
f = F ∘ φ.

8. Consider the set F(N, Q) of all functions from N to Q, endowed with the relation

f ~ g if and only if f(n) = g(n) for all n > N, where N is a natural number that
depends on f and g.

Prove that this is an equivalence relation.


1.2 An Abstract Overview of the Real Number System 


The field of real numbers is a set R endowed with two algebraic operations, the
addition (denoted +) and the multiplication (denoted ·), and also with an order relation
(denoted ≤), with respect to which a list of 7 axioms is verified. They will be detailed
in what follows.

The elements of R are called real numbers. 

The first three sets of axioms are only the general description of an arbitrary 
commutative field. In this respect, the law of distributivity represents a condition 
of compatibility between the two algebraic operations, addition, and multiplication, 
described by the Axioms 1.2.1 and 1.2.2. 


1.2.1 Addition Axioms (called, in this order, well-definedness, associativity, commutativity,
the existence of a neutral element (denoted 0), the existence of a negative
(or opposite) for each number (the negative of x is denoted −x)):


(A1) If a and b belong to R, then their sum a + b is also in R.
(A2) If a, b, c belong to R, then


(a + b) + c = a + (b + c).

(A3) If a and b belong to R, then a + b = b + a.
(A4) R contains an element 0 such that a + 0 = a for all a ∈ R.
(A5) For each a ∈ R, there is an element −a ∈ R such that


a + (−a) = 0.
1.2.2 Multiplication Axioms (called, in this order, well-definedness, associativity,
commutativity, the existence of a unit (denoted 1), the existence of an inverse
(or reciprocal) for each nonzero element (the inverse of x is denoted x⁻¹ or 1/x)):


(M1) If a and b belong to R, then their product a · b is also in R.
(M2) If a, b, c belong to R, then


(a · b) · c = a · (b · c).


(M3) If a and b belong to R, then a · b = b · a.
(M4) R contains an element 1 different from 0 such that a · 1 = a for all a ∈ R.
(M5) For each a ∈ R, a ≠ 0, there is an element 1/a ∈ R such that


a · (1/a) = 1.


The product of two real numbers a and b is also denoted by ab. 


1.2.3 Distributivity Law For all a, b, c in R,

a · (b + c) = a · b + a · c.


Axioms 1.2.1–1.2.3 tell us that R is a commutative field. In particular, they
guarantee the existence and uniqueness of the solution of every equation of the form


a + x = b,


or of the form

c · x = d,


for all a, b, c, d ∈ R with c ≠ 0.
Three immediate consequences of the above discussion are as follows:


a · 0 = 0 · a = 0,
−a = (−1) a,


and

ab = 0 implies a = 0 or b = 0.


The next three groups of Axioms 1.2.4–1.2.6 make R an ordered field.

1.2.4 Order Axioms The relation ≤ has all the properties of an order relation:


(OR1) reflexivity, that is, a ≤ a for all a;
(OR2) antisymmetry, that is, a ≤ b and b ≤ a implies a = b;
(OR3) transitivity, that is, a ≤ b and b ≤ c implies a ≤ c.

It is useful to introduce a strict ordering <, given by

a < b if and only if a ≤ b and a ≠ b.


For convenience, a ≤ b is also denoted by b ≥ a (and a < b is the same as b > a).

A number a is called nonnegative if a ≥ 0 and positive if a > 0. Reversing
the inequalities, we get the notions of nonpositive number and of negative number,
respectively.


1.2.5 Axioms of Compatibility (between the algebraic structure and ordering):


(AS/O1) a ≤ b and c ∈ R implies a + c ≤ b + c;
(AS/O2) a ≤ b and c ≥ 0 implies ac ≤ bc.


From Axioms 1.2.5 it follows immediately that a set of inequalities can be added
side by side:


a₁ ≤ b₁, ..., a_n ≤ b_n implies a₁ + ··· + a_n ≤ b₁ + ··· + b_n,

the final inequality being strict if at least one of the initial inequalities is strict. Another
consequence is


a < b and c > 0 implies ac < bc.


In particular,

a > 0 and b > 0 implies ab > 0. (1.1)


From Axioms 1.2.5 it also follows that

a ≤ b if and only if b − a ≥ 0


and

a < b if and only if b − a > 0.


1.2.6 Total Ordering Axiom Given two real numbers a and b, either a ≤ b or
b ≤ a.


The last axiom can be reformulated as follows: For every real number a, one (and
only one) of the following three possibilities occurs:


a < 0, or a = 0, or a > 0.


As a consequence,

a > 0 if and only if −a < 0.


By combining the Total Ordering Axiom and Axiom 1.2.5 (AS/O2) we infer
that

a < 0 and b > 0 implies ab < 0,


while

a < 0 and b < 0 implies ab > 0.


Indeed, in the first case −a > 0, so by Axiom 1.2.5 (AS/O2) we deduce that
−(ab) = (−a) b > 0, whence ab < 0. The second case can be treated similarly.
As a consequence, we obtain that all squares are nonnegative.
Therefore 1 = 1 · 1 > 0 (recall that 1 ≠ 0 by (M4)), and this yields two important facts
relating the ordering and the operation of division:


a > 0 implies 1/a > 0


and

0 < a < b implies 0 < 1/b < 1/a.

On the other hand, from 1 > 0 we infer the natural ordering of the set N of
natural numbers:


0 < 1 < 2 = 1 + 1 < 3 = 2 + 1 < ··· < n < n + 1 < ···,


which makes N the smallest subset of R that contains 0 and, together with each n, its
successor n + 1. N is also well ordered in the sense that every nonempty subset A
of it has a smallest element, that is, an element a ∈ A such that a ≤ x for all x ∈ A.
This fact is at the heart of the following principle:


Principle of Mathematical Induction Let P(n) be a statement which depends on
the natural number n. If


(a) P(n₀) is true, and

(b) for all n ∈ N, n ≥ n₀, assuming that P(n) is true, it follows that P(n + 1) is
also true,

then P(n) is true for all natural numbers n ≥ n₀.


A variant of this principle, called the Principle of Strong Mathematical Induction, 
is presented in Exercise 9, at the end of this section. The Principle of Mathematical 
Induction has many important applications. One of them is as follows: 


The Binomial Theorem 


\[
(a+b)^n = \binom{n}{0} a^n + \binom{n}{1} a^{n-1} b + \binom{n}{2} a^{n-2} b^2 + \cdots + \binom{n}{n-1} a b^{n-1} + \binom{n}{n} b^n
\]

for all a, b ∈ R and n ∈ N*.
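
A numerical check is of course not a proof, but it can help in reading the formula. The sketch below compares both sides of the Binomial Theorem in exact rational arithmetic; the helper name `binomial_expansion` and the sample values of a and b are our own choices.

```python
from fractions import Fraction
from math import comb

def binomial_expansion(a, b, n):
    """Right-hand side of the Binomial Theorem: sum over k of C(n, k) a^(n-k) b^k."""
    return sum(comb(n, k) * a ** (n - k) * b ** k for k in range(n + 1))

a, b = Fraction(2, 3), Fraction(-5, 7)
# Compare with (a + b)^n for n = 1, ..., 8, using exact fractions to avoid rounding.
checks = [binomial_expansion(a, b, n) == (a + b) ** n for n in range(1, 9)]
```

Taking a = b = 1 recovers the familiar identity that the binomial coefficients of order n add up to 2^n.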
The Total Ordering Axiom is also responsible for a large variety of functions of
utmost importance:
the signum function,

\[
\operatorname{sgn} x = \begin{cases} -1 & \text{if } x < 0 \\ 0 & \text{if } x = 0 \\ 1 & \text{if } x > 0; \end{cases}
\]

the positive part function,

\[
x^+ = \begin{cases} x & \text{if } x \ge 0 \\ 0 & \text{if } x < 0; \end{cases}
\]

the negative part function,

\[
x^- = \begin{cases} -x & \text{if } x < 0 \\ 0 & \text{if } x \ge 0; \end{cases}
\]

the absolute value (or modulus) function,

\[
|x| = \begin{cases} x & \text{if } x \ge 0 \\ -x & \text{if } x < 0. \end{cases}
\]

The graphs of the last three functions are sketched in Fig. 1.1.
They are connected by the formulas

x = x⁺ − x⁻
|x| = x⁺ + x⁻,

which yield

x⁺ = (|x| + x)/2 and x⁻ = (|x| − x)/2.


Moreover, if a > 0, then


|x| < a if and only if −a < x < a.


Fig. 1.1 The graphs of the functions x⁺, x⁻ and |x|, respectively
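
The formulas connecting x⁺, x⁻ and |x| can be verified on a few sample values. This is only a sketch; the function names and the sample points are ours.

```python
def positive_part(x):
    return x if x >= 0 else 0

def negative_part(x):
    return -x if x < 0 else 0

samples = [-3, -0.5, 0, 2, 7.25]
identities_hold = all(
    x == positive_part(x) - negative_part(x)             # x = x+ - x-
    and abs(x) == positive_part(x) + negative_part(x)    # |x| = x+ + x-
    and positive_part(x) == (abs(x) + x) / 2
    and negative_part(x) == (abs(x) - x) / 2
    for x in samples
)
```

Observe that both parts are nonnegative and that, for every x, at least one of x⁺ and x⁻ vanishes.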
The main properties of the absolute value function are:


(AV1) |x| ≥ 0;

(AV2) |x| = 0 if and only if x = 0;
(AV3) |x · y| = |x| · |y|;

(AV4) |x + y| ≤ |x| + |y|.


The property (AV4) is known as the absolute value inequality. Equality occurs
in the absolute value inequality if and only if xy ≥ 0 (that is, if x and y belong to the
same closed half-line issuing from 0).

A companion of the absolute value inequality is


| |x| − |y| | ≤ |x + y|,


which is a direct consequence of it.
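
For the reader's convenience, here is one way to carry out this deduction; it is a sketch that uses only (AV3) (in the form |−y| = |y|) and (AV4).

```latex
\begin{align*}
|x| &= |(x+y) + (-y)| \le |x+y| + |-y| = |x+y| + |y|, \\
|y| &= |(x+y) + (-x)| \le |x+y| + |-x| = |x+y| + |x|.
\end{align*}
% Hence |x| - |y| \le |x+y| and |y| - |x| \le |x+y|, and therefore
\[
\bigl|\, |x| - |y| \,\bigr| = \max\{\, |x| - |y|,\ |y| - |x| \,\} \le |x+y| .
\]
```

Replacing y by −y gives the equivalent form | |x| − |y| | ≤ |x − y|.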

The comparison of pairs of real numbers can be extended easily to finite families of
real numbers. This is done via the notions of minimum and maximum. The minimum
of a pair of numbers is defined by

\[
\min\{x_1, x_2\} = \begin{cases} x_1 & \text{if } x_1 \le x_2 \\ x_2 & \text{if } x_2 \le x_1, \end{cases}
\]

while for families of n ≥ 3 numbers, the minimum is defined inductively by

min{x₁, ..., x_n} = min{min{x₁, ..., x_{n−1}}, x_n}.


The maximum (denoted as max) is defined in a similar way, replacing ≤ by ≥.
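
The inductive definition translates directly into a recursive procedure. The following sketch mirrors it step by step; the function name `minimum` is ours, and the one-element case is added only for convenience (it is not part of the definition above).

```python
def minimum(xs):
    """Minimum of a nonempty finite list, following the inductive definition."""
    if len(xs) == 1:
        # Convenience case, not part of the definition in the text.
        return xs[0]
    if len(xs) == 2:
        x1, x2 = xs
        return x1 if x1 <= x2 else x2
    # min{x_1, ..., x_n} = min{min{x_1, ..., x_{n-1}}, x_n}
    return minimum([minimum(xs[:-1]), xs[-1]])
```

For instance, `minimum([3, 1, 2])` first computes `minimum([3, 1])` and then compares the result with 2.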

In the general case, the problem of comparing infinitely many real numbers turns
out to be more subtle. As in the celebrated case of the Parallel Postulate, we solve
the matter by introducing a new axiom. In order to state it, we need a definition,
inspired by the case of finite sets. A nonempty subset A of R is called bounded above
if there is M ∈ R such that a ≤ M for all a ∈ A; a number M with this property is
called an upper bound of the set. If M is an upper bound of A, then every number
greater than M is also an upper bound.

Given a subset A of R that is bounded above, we say that a real number z is the 
least upper bound, or the supremum of A (abbreviated, z = sup A) if: 


(LUB1) z is an upper bound of A; 
(LUB2) z is less than or equal to any other upper bound of A. 


Clearly, this definition has the same content if (LUB2) is replaced by the following
condition: no matter how small ε > 0 is, the number z − ε is no longer an upper bound
of A. Thus, z = sup A holds true if and only if a ≤ z for all a ∈ A and for each
ε > 0, there is an element a_ε ∈ A such that


a_ε > z − ε.
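
To see the ε-characterization at work, consider the set A = {1 − 1/n : n = 1, 2, ...}, an example of ours. Every element of A is less than 1, so 1 is an upper bound, and the sketch below searches, for a given ε > 0, for an element a_ε of A with a_ε > 1 − ε; this indicates that sup A = 1 although 1 does not belong to A. The helper name `witness` is hypothetical.

```python
from fractions import Fraction

def witness(eps):
    """Return an element a_eps of A = {1 - 1/n : n = 1, 2, ...} with a_eps > 1 - eps."""
    n = 1
    while not 1 - Fraction(1, n) > 1 - eps:
        n += 1
    return 1 - Fraction(1, n)

eps_values = [Fraction(1, 2), Fraction(1, 10), Fraction(1, 1000)]
witnesses = [witness(eps) for eps in eps_values]
```

The loop always terminates for ε > 0, since 1/n eventually drops below ε; the smaller ε is, the larger the index n that is needed.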
Every nonempty finite set of real numbers is bounded above and admits a least
upper bound (which is its maximum). In the case of an arbitrary infinite subset, the
existence of the least upper bound is decreed by an axiom:


1.2.7 Completeness Axiom (also known as Dedekind’s Axiom): Every nonempty
set of real numbers that is bounded above admits a least upper bound.


In a similar way, replacing ≤ by ≥, we may introduce the notion of a set bounded
below, as well as those of lower bound and greatest lower bound (also called infimum).
The existence of the greatest lower bound for nonempty sets of numbers bounded
below follows from the following result.


1.2.8 Proposition Every nonempty set A of real numbers that is bounded below admits
a greatest lower bound, inf A.


Proof Put −A = {−a : a ∈ A}. If x is a lower bound for A, then −x is an upper
bound for −A, because x ≤ a is equivalent to −x ≥ −a. By Axiom 1.2.7,
the set −A admits a least upper bound, say α. It is easy to see now that −α is the
greatest lower bound of A.


The subsets of R that are at the same time bounded above and below are
called bounded subsets. Necessarily, these sets admit both a supremum and an infimum
(provided they are nonempty). The subsets which are not bounded are called
unbounded.

Among the bounded subsets of R, one distinguishes the bounded intervals. For
a, b ∈ R, with a ≤ b, we may consider the following four types of bounded intervals
of endpoints (or extremities) a and b:


(a, b) = {x : x ∈ R, a < x < b}, the open interval;

[a, b) = {x : x ∈ R, a ≤ x < b}, the closed-open interval;
(a, b] = {x : x ∈ R, a < x ≤ b}, the open-closed interval;
[a, b] = {x : x ∈ R, a ≤ x ≤ b}, the closed interval.


The nonempty bounded intervals can be equally described in their connection
with the intervals of endpoints 0 and 1. For example,


[a, b] = {x : x ∈ R, x = (1 − t) a + t b for some t ∈ [0, 1]}.


The interval [0, 1] is known as the unit interval.

A useful remark is that a subset of R is bounded if and only if it is included in a
bounded interval. Since every bounded interval of endpoints a and b is included in
a bounded interval of the form [−M, M], with M = max{|a|, |b|}, the property of
boundedness of a nonempty subset A of R means precisely the existence of a positive
number M such that

|a| ≤ M for all a ∈ A.


The length of a bounded interval I of endpoints a ≤ b is defined by the formula

ℓ(I) = b − a.


We say that an interval is nondegenerate if it is nonempty and does not reduce to a
single point. The length of a nondegenerate interval is a positive number.

It is useful to transfer the terminology from sets of real numbers to real-valued 
functions, according to the properties of their ranges. 

Thus, a function f : X → R is called bounded below, bounded above, bounded,
or unbounded, respectively, if f(X) has this property.

The characteristic function of any subset A of X,

\[
\chi_A : X \to \mathbb{R}, \qquad \chi_A(x) = \begin{cases} 1 & \text{if } x \in A \\ 0 & \text{if } x \in X \setminus A, \end{cases}
\]

is an example of a bounded function.

The greatest lower bound inf f (of a bounded below function f) and the least
upper bound sup f (of a bounded above function f) are defined respectively by the
formulas

inf f = inf{f(x) : x ∈ X}


and

sup f = sup{f(x) : x ∈ X}.


Sometimes, inf f and sup f are denoted inf_{x∈X} f(x) and sup_{x∈X} f(x).


Suppose that X is a nonempty subset of R. A function f : X → R is called
increasing if

x ≤ y in X implies f(x) ≤ f(y)


and strictly increasing if

x < y in X implies f(x) < f(y).


In a similar manner, we can introduce the notions of decreasing function and of
strictly decreasing function. The functions which are increasing or decreasing constitute
together the class of monotone functions; the class of strictly monotone functions
has an obvious meaning.

If f : [a, b] → R is a monotone function, then its bounds are precisely the values
at the endpoints. Any function which is either strictly decreasing or strictly increasing
is injective.

An important class of functions defined on R is that of polynomial functions with
real coefficients. The polynomial functions of degree 0 are by definition the constant
functions not identically 0, while the polynomial functions of degree n ≥ 1 are the
functions of the form


P(x) = a₀ x^n + a₁ x^{n−1} + ··· + a_n,

where a₀, a₁, ..., a_n ∈ R and a₀ ≠ 0. Two special cases are the affine functions
(polynomial functions of the form ax + b with a, b ∈ R) and quadratic functions
(the polynomial functions of degree 2).

A polynomial function of degree n ≥ 2 may not be monotone. However, as
the case of quadratic functions suggests, R can be decomposed into a finite union of
intervals on which these functions are strictly monotone. The details in the general
case need Calculus and are given in Chap. 8.

Symmetry properties play an important role in Mathematics. In the context of
real-valued functions f defined on a nondegenerate interval I, the symmetries are
described by the bijective maps σ : graph f → graph f, different from the identity.
In particular, if I is R or a bounded interval symmetric with respect to the origin (that
is, I is (−a, a) or [−a, a]), we encounter two special types of symmetric functions,
the even functions and the odd functions. A function f defined on such an interval is
called even if f(−x) = f(x), and odd if f(−x) = −f(x) for all x. Examples of even
functions are: the constant functions, the quadratic function x², the absolute value
function, etc. The identity and the cubic function x³ are examples of odd functions.
Every function f is the sum of an even function and an odd function:

\[
f(x) = \frac{f(x) + f(-x)}{2} + \frac{f(x) - f(-x)}{2}.
\]
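
The decomposition can be tested on a concrete function. In the sketch below, `even_part` and `odd_part` are hypothetical helper names and the polynomial f is a sample of our choosing.

```python
def even_part(f):
    return lambda x: (f(x) + f(-x)) / 2

def odd_part(f):
    return lambda x: (f(x) - f(-x)) / 2

f = lambda x: x**3 + 2 * x**2 - x + 5   # sample function, chosen for illustration
fe, fo = even_part(f), odd_part(f)
points = [-2, -1, 0, 1, 3]
```

For this f, the even part is 2x² + 5 and the odd part is x³ − x, that is, the terms of even and odd degree, respectively.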


Another instance of symmetry is periodicity. See Sect. 1.4. 
Building new functions from old ones is a usual practice in analysis. The most
common transformations of a function f of a real variable are:


f(x) + k, vertical shift by k units (upward if k > 0 and downward if k < 0);

f(x + h), horizontal shift by h units (to the right if h < 0, to the left
if h > 0);

αf(x), vertical dilation by a factor of α > 0 (a stretching if α > 1, a
shrinking if 0 < α < 1);

f(αx), horizontal dilation by a factor of 1/α;

−f(x), reflection with respect to the x-axis;

f(−x), reflection with respect to the y-axis.


1.2.9 Remark (Linear lattices of functions) Assuming that A is an arbitrary nonempty
set, the set F(A, R) of all functions f : A → R admits a natural structure of
commutative algebra with unit (the constant function identically 1), when endowed
with the pointwise operations, that is,

\[ (f + g)(x) = f(x) + g(x) \]
\[ (\alpha f)(x) = \alpha f(x) \]
\[ (fg)(x) = f(x) g(x). \]


This algebra can also be endowed with the pointwise order, defined by

\[ f \le g \quad \text{if and only if} \quad f(x) \le g(x) \ \text{for all } x \in A. \]

Accordingly, a function f : A → R is called positive (that is, f ≥ 0) if

\[ f(x) \ge 0 \quad \text{for all } x \in A \]

and strictly positive (denoted f > 0) if f(x) > 0 for all x ∈ A. The negative functions
and the strictly negative functions can be introduced in a similar manner (by reversing
the inequalities).

The above order relation on F(A, R) satisfies the Axioms 1.2.4–1.2.5 (defining a
relation that is compatible with the algebraic structure). The Total Ordering Axiom
fails for F(A, R) when A contains at least two distinct elements, because
not every two functions are comparable. However, as in the case of real numbers,
every pair of functions admits a greatest lower bound and a least upper bound, defined
pointwise by the formulas

\[ (\inf\{f, g\})(x) = \inf\{f(x), g(x)\} \]
\[ (\sup\{f, g\})(x) = \sup\{f(x), g(x)\} \]

for all x ∈ A. Even more, F(A, R) verifies the Completeness Axiom.
The positive part $f^+$, the negative part $f^-$ and the absolute value (or modulus)
|f| of a function f are defined pointwise by formulas of the type

\[ f^+(x) = (f(x))^+, \quad x \in A. \]

Thus, the positive part of a function is a positive function and the three functions
mentioned above are related by

\[ f = f^+ - f^- \quad \text{and} \quad |f| = f^+ + f^-. \]


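By a linear lattice of functions on the set A we mean any linear subspace $\mathcal{A}$ of
F(A, R) such that

\[ f, g \in \mathcal{A} \quad \text{implies} \quad \inf\{f, g\},\ \sup\{f, g\} \in \mathcal{A}. \]

Clearly, this implication extends to all finite families of functions of $\mathcal{A}$. Since

\[ \inf\{f, g\} = \frac{f + g - |f - g|}{2} \quad \text{and} \quad \sup\{f, g\} = \frac{f + g + |f - g|}{2}, \]

it follows that a linear subspace $\mathcal{A}$ of F(A, R) is a linear lattice of functions if and
only if

\[ |f| \in \mathcal{A} \quad \text{whenever } f \in \mathcal{A}. \]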
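The two formulas expressing inf{f, g} and sup{f, g} through the absolute value can be tried out on concrete functions. The sketch below is illustrative only: the names `pointwise_inf` and `pointwise_sup` are hypothetical, and the negative part is taken as sup{−f, 0}, the usual convention, which is an assumption since the text only displays the formula for the positive part.

```python
def pointwise_inf(f, g):
    """inf{f, g} computed as (f + g - |f - g|) / 2."""
    return lambda x: (f(x) + g(x) - abs(f(x) - g(x))) / 2

def pointwise_sup(f, g):
    """sup{f, g} computed as (f + g + |f - g|) / 2."""
    return lambda x: (f(x) + g(x) + abs(f(x) - g(x))) / 2

f = lambda x: x * x
g = lambda x: 2 * x + 1
zero = lambda x: 0

lower, upper = pointwise_inf(f, g), pointwise_sup(f, g)
g_plus = pointwise_sup(g, zero)                    # positive part of g
g_minus = pointwise_sup(lambda x: -g(x), zero)     # negative part of g (assumed convention)

for x in range(-3, 4):
    assert lower(x) == min(f(x), g(x))
    assert upper(x) == max(f(x), g(x))
    assert g_plus(x) - g_minus(x) == g(x)          # g = g+ - g-
    assert g_plus(x) + g_minus(x) == abs(g(x))     # |g| = g+ + g-
```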


As a consequence, $F_b(A, \mathbb{R})$, the space of all bounded functions f : A → R, is a
linear lattice. In the following chapters we will encounter many other examples of
linear lattices.
A sequence of elements in a set X is a function a : N → X. A subsequence of
a sequence a is a composition a ∘ k, where k : N → N is any strictly increasing
function. Usually, a(n) (the value of a at n) is denoted $a_n$, and the sequence a itself
is denoted by one of the symbols

\[ (a_n)_{n \in \mathbb{N}}, \quad (a_n)_{n \ge 0} \quad \text{or} \quad (a_n)_n; \]

$a_n$ is referred to as the element of index n. With this convention, a subsequence a ∘ k
is denoted $(a_{k_n})_n$ or $(a_{k(n)})_n$. Necessarily, $k_n \ge n$ for every n.

Sometimes, it is useful to consider sequences indexed over subsets of Z of the
form {k, k + 1, k + 2, ...}, where k ∈ Z; see for example $(1/n)_{n \in \mathbb{N}^*}$. The theory
of these sequences is very much the same as that of usual sequences, indexed by N,
and needs only obvious adaptations.

Sequences are a basic tool in describing recursive processes. This will be discussed 
in the following chapters. 

Many notions concerning functions automatically transfer to sequences. Such are the
notions of bounded sequence, increasing sequence, decreasing sequence, monotone
sequence, strictly increasing sequence, positive sequence, etc.


Exercises 


1. Let a, b, c ∈ R. Prove that:
(a) a + min{b, c} = min{a + b, a + c};
(b) a + max{b, c} = max{a + b, a + c};
(c) |min{a, b} − min{a, c}| ≤ |b − c|;
(d) |max{a, b} − max{a, c}| ≤ |b − c|.
2. Let A, B ⊂ R be two nonempty sets of real numbers. Prove that:
(a) If A ⊂ B, then inf B ≤ inf A ≤ sup A ≤ sup B;
(b) If A, B ⊂ R are bounded subsets, then A ∪ B is bounded and

\[ \min(\inf A, \inf B) = \inf(A \cup B) \le \sup(A \cup B) = \max(\sup A, \sup B). \]
3. Let A and B be two nonempty subsets of R. Put

\[ -A = \{-a : a \in A\} \quad \text{and} \quad A + B = \{a + b : a \in A,\ b \in B\}. \]

Prove that:
(a) If A is bounded, then −A is bounded and

\[ \sup(-A) = -\inf A, \quad \inf(-A) = -\sup A; \]

(b) If A and B are bounded, then A + B is bounded and

\[ \inf(A) + \inf(B) = \inf(A + B); \]

(c) If one of the sets A and B is bounded and the other is unbounded, then A + B
is unbounded.


4. Let f : A × B → R be a bounded above function. Prove that

\[ \sup_{(x,y) \in A \times B} f(x, y) = \sup_{x \in A} \Bigl( \sup_{y \in B} f(x, y) \Bigr) = \sup_{y \in B} \Bigl( \sup_{x \in A} f(x, y) \Bigr). \]

5. Let A be a subset of R and let $f_1, \ldots, f_n$ : A → R be some bounded functions.
Prove that

\[ \inf f_1 + \cdots + \inf f_n \le \inf (f_1 + \cdots + f_n) \]

and

\[ \sup f_1 + \cdots + \sup f_n \ge \sup (f_1 + \cdots + f_n). \]

6. (Cauchy-Buniakovski-Schwarz inequality). Prove that

\[ \Bigl( \sum_{k=1}^n a_k b_k \Bigr)^2 \le \Bigl( \sum_{k=1}^n a_k^2 \Bigr) \Bigl( \sum_{k=1}^n b_k^2 \Bigr) \]

for all $a_1, \ldots, a_n, b_1, \ldots, b_n$ ∈ R.

Prove that the above inequality becomes an equality if and only if there is t ∈ R
such that either $b_1 = t a_1, \ldots, b_n = t a_n$, or $a_1 = t b_1, \ldots, a_n = t b_n$.

(Hint: Use the fact that $\sum_{k=1}^n (a_k x + b_k)^2 \ge 0$ for all x ∈ R.)


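7. (Chebyshev's Algebraic Inequality). Suppose that $x_1 \ge x_2 \ge \cdots \ge x_n$ and
$y_1 \ge y_2 \ge \cdots \ge y_n$. Prove that

\[ \frac{1}{n} \sum_{k=1}^n x_k y_k \ge \Bigl( \frac{1}{n} \sum_{k=1}^n x_k \Bigr) \Bigl( \frac{1}{n} \sum_{k=1}^n y_k \Bigr). \]

If the two families of numbers are monotonic of opposite direction, then this
inequality should be reversed.

8. (The rearrangement inequality of Hardy-Littlewood-Pólya). Given a finite family
$x_1, \ldots, x_n$ of n real numbers, we denote by $x_1^{\downarrow} \ge \cdots \ge x_n^{\downarrow}$ its decreasing
rearrangement. Prove that

\[ \sum_{k=1}^n x_k^{\downarrow} y_{n-k+1}^{\downarrow} \le \sum_{k=1}^n x_k y_k \le \sum_{k=1}^n x_k^{\downarrow} y_k^{\downarrow} \]

for all families $x_1, \ldots, x_n$ and $y_1, \ldots, y_n$ of real numbers.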


9. The Principle of Strong Mathematical Induction has the following statement:
Let P(n) be a statement involving the natural number n. If
(a) $P(n_0)$ is true, and
(b) for each $k \ge n_0$, assuming that P(n) is true for $n = n_0, n_0 + 1, \ldots, k$, it is
also true for n = k + 1,
then P(n) is true for all n ∈ N, $n \ge n_0$.
Infer this principle from the usual Principle of Mathematical Induction.

10. Consider a sequence $(a_n)_{n \ge 1}$ of positive numbers such that

\[ (a_1 + a_2 + \cdots + a_n)^2 = a_1^3 + a_2^3 + \cdots + a_n^3, \quad n \ge 1. \]

Prove that $a_n = n$ for all n ≥ 1.
11. The Fibonacci numbers $F_0, F_1, F_2, \ldots$ are defined by

\[ F_0 = F_1 = 1 \quad \text{and} \quad F_n = F_{n-1} + F_{n-2} \ \text{for all } n \ge 2. \]

Prove that $F_n < (7/4)^n$ for all n > 0.


1.3 A First Encounter with the Exponential Function 


An immediate consequence of the Completeness Axiom is the existence of the nth root
of a nonnegative number:


1.3.1 Proposition For every nonnegative real number a and every integer n ≥ 2,
there is a unique nonnegative solution of the algebraic equation

\[ x^n = a. \]

This solution, usually denoted $\sqrt[n]{a}$ or $a^{1/n}$, is called the nth root of a.


Proof Clearly, $\sqrt[n]{0} = 0$. Assume now that a > 0. The formula

\[ x^n - y^n = (x - y)(x^{n-1} + x^{n-2} y + \cdots + y^{n-1}), \]

for x and y nonnegative, shows that $x^n > y^n$ if and only if x > y. This proves the
uniqueness of the positive solution of $x^n = a$.

The set A = {x : x ≥ 0 and $x^n \le a$} contains 0 and is bounded above by 1 if
a ≤ 1, and by a if a > 1. Thus, the Completeness Axiom implies the existence of
z = sup A ≥ 0. We will show that $z^n = a$.

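Indeed, assuming that $r = a - z^n > 0$ and taking into account the Binomial
Formula, we infer that every $\varepsilon \in \bigl(0, \min\bigl\{1, \frac{r}{(z+1)^n - z^n}\bigr\}\bigr)$ verifies the formula

\[
\begin{aligned}
(z + \varepsilon)^n &= \binom{n}{0} z^n + \binom{n}{1} z^{n-1} \varepsilon + \binom{n}{2} z^{n-2} \varepsilon^2 + \cdots + \binom{n}{n-1} z \varepsilon^{n-1} + \binom{n}{n} \varepsilon^n \\
&= z^n + \varepsilon \Bigl[ \binom{n}{1} z^{n-1} + \binom{n}{2} z^{n-2} \varepsilon + \cdots + \binom{n}{n-1} z \varepsilon^{n-2} + \binom{n}{n} \varepsilon^{n-1} \Bigr] \\
&\le z^n + \varepsilon \Bigl[ \binom{n}{1} z^{n-1} + \binom{n}{2} z^{n-2} + \cdots + \binom{n}{n-1} z + \binom{n}{n} \Bigr] \\
&= z^n + \varepsilon \bigl[ (z + 1)^n - z^n \bigr] < z^n + a - z^n = a,
\end{aligned}
\]

a fact that contradicts the choice of z. Therefore, $z^n \ge a$.
The case where $s = z^n - a > 0$ can be treated in a similar way, by noticing that
every $\varepsilon \in \bigl(0, \min\bigl\{1, \frac{s}{(z+1)^n - z^n}\bigr\}\bigr)$ verifies the formula

\[
\begin{aligned}
(z - \varepsilon)^n &= \binom{n}{0} z^n - \binom{n}{1} z^{n-1} \varepsilon + \binom{n}{2} z^{n-2} \varepsilon^2 - \cdots + (-1)^n \binom{n}{n} \varepsilon^n \\
&= z^n - \varepsilon \Bigl[ \binom{n}{1} z^{n-1} - \binom{n}{2} z^{n-2} \varepsilon + \cdots + (-1)^{n-1} \binom{n}{n} \varepsilon^{n-1} \Bigr] \\
&\ge z^n - \varepsilon \Bigl[ \binom{n}{1} z^{n-1} + \binom{n}{2} z^{n-2} + \cdots + \binom{n}{n} \Bigr] \\
&= z^n - \varepsilon \bigl[ (z + 1)^n - z^n \bigr] > z^n + a - z^n = a.
\end{aligned}
\]

Since z is the least upper bound of A, there should exist x ∈ A such that z − ε < x.
Then $a < (z - \varepsilon)^n < x^n \le a$, a contradiction.
Consequently, $z^n = a$.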


Another proof of Proposition 1.3.1, based on the Intermediate Value Theorem, is 
available in Example 6.4.3 below. 
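The set A used in the proof also suggests a simple way to approximate the nth root numerically. The following is only a sketch under stated assumptions: it uses bisection (not the argument of the proof) to locate sup{x ≥ 0 : xⁿ ≤ a} in floating-point arithmetic, and the function name `nth_root` and the tolerance parameter `tol` are illustrative.

```python
def nth_root(a, n, tol=1e-12):
    """Approximate sup{x >= 0 : x**n <= a} by bisection (floating point)."""
    if a < 0 or n < 2:
        raise ValueError("need a >= 0 and n >= 2")
    # Upper bound taken from the proof: 1 if a <= 1, and a if a > 1.
    lo, hi = 0.0, max(1.0, float(a))
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mid == lo or mid == hi:   # floating-point resolution reached
            break
        if mid ** n <= a:
            lo = mid                 # mid belongs to the set A
        else:
            hi = mid                 # mid is an upper bound of A
    return lo

assert abs(nth_root(2, 2) - 2 ** 0.5) < 1e-9
assert abs(nth_root(27, 3) - 3) < 1e-9
```

The returned value is a lower approximation: it always belongs to A, and it differs from the true root by at most roughly `tol`.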

It is worth noticing that when a > 0 and the exponent n > 0 is an even natural
number, then the equation $x^n = a$ has two real solutions: $a^{1/n}$ and $-a^{1/n}$. When
a < 0 and the exponent n is an odd natural number, then the equation $x^n = a$ admits
the solution $-(-a)^{1/n}$.

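Powers of a > 0 with positive rational exponents are defined using the formula

\[ a^{m/n} = \sqrt[n]{a^m} = \bigl( \sqrt[n]{a} \bigr)^m, \]

where the last equality is motivated by the uniqueness part of Proposition 1.3.1.
Powers with nonpositive exponents are defined by using the formulas

\[ a^0 = 1, \quad a^{-1} = \frac{1}{a} \quad \text{and} \quad a^{-m/n} = \frac{1}{a^{m/n}} \quad \text{for } m, n \in \mathbb{N},\ n > 0. \]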


It should be noticed that the value of $a^{m/n}$ is independent of the representation of
the rational exponent m/n.

1.3.2 Exponential and Logarithm The extension of the exponential function $a^x$
(of base a > 0, a ≠ 1) from rational exponents to arbitrary real exponents is
presented in Sect. 7.1. It appears as the unique bijective function $f_a : \mathbb{R} \to (0, \infty)$
that fulfills the following three properties:


(EXP1) $f_a(x + y) = f_a(x) f_a(y)$ for all x, y ∈ R;

(EXP2) $f_a(1) = a$;

(EXP3) the function $f_a$ is strictly decreasing if a ∈ (0, 1), and strictly increasing if
a > 1.


As in the case of Proposition 1.3.1, the proof is built on the Completeness Axiom.
The logarithmic function $\log_a$ of base a (a > 0, a ≠ 1) is defined as the inverse of
the exponential function $a^x$. Thus

\[ a^{\log_a x} = x \quad \text{for all } x > 0 \]

\[ \log_a a^x = x \quad \text{for all } x \in \mathbb{R} \]

and the properties (EXP1)–(EXP3) become in the case of the logarithmic function:


(LOG1) $\log_a xy = \log_a x + \log_a y$ for all x > 0 and y > 0;

(LOG2) $\log_a a = 1$;

(LOG3) the function $\log_a$ is strictly decreasing if a ∈ (0, 1), and strictly increasing
if a > 1.


Due to this fact, we consider it reasonable to use the exponential and
logarithmic functions in advance (as long as only their basic properties are involved).

Since R is a field, it must contain the field generated by 0 and 1, which is Q.
Proposition 1.3.1 allows us to show the existence of irrational numbers, that is, of
real numbers that are not rational. Indeed, such an example is $\sqrt{2}$. The argument
(which goes back to the Pythagorean School, sixth century B.C.) is by contradiction.
Suppose that $\sqrt{2} = a/b$, where a, b are nonzero natural numbers with no common
factor. Then $a^2 = 2b^2$, and in this case a is a multiple of 2. Then $a^2$ is a multiple
of 4. But $2b^2$ is a multiple of 4 only when b is also even. Therefore a and b have a
common factor 2, contrary to our assumption. Thus $\sqrt{2}$ is irrational.

The fact that R is "much larger" than Q is justified in Sect. 1.7.

We illustrate the existence of nth roots with an important inequality comparing
the arithmetic mean $(a_1 + \cdots + a_n)/n$ and the geometric mean $\sqrt[n]{a_1 \cdots a_n}$ of any
n positive numbers $a_1, \ldots, a_n$.


1.3.3 The Arithmetic Mean–Geometric Mean Inequality (abbreviated, the
AM-GM Inequality): If $a_1, \ldots, a_n$ are positive numbers, then

\[ \frac{a_1 + \cdots + a_n}{n} \ge \sqrt[n]{a_1 \cdots a_n}. \]

Moreover, the inequality is strict except in the case where all numbers $a_1, \ldots, a_n$ are
equal to each other.

Proof By Mathematical Induction one can prove the inequality for all families of $2^p$
numbers (for p ≥ 1). For n > 2 not of the form $2^p$ for any p, we apply the inequality
for the family of $2^n$ numbers,

\[ a_1, \ldots, a_n, \underbrace{G, \ldots, G}_{2^n - n \text{ times}}, \]

where $G = \sqrt[n]{a_1 \cdots a_n}$. Since the product of these $2^n$ numbers is $G^{2^n}$, we get
$(a_1 + \cdots + a_n + (2^n - n) G)/2^n \ge G$, that is, $a_1 + \cdots + a_n \ge nG$.


Two other proofs of the AM-GM Inequality are the subject of Exercises 5
and 6. The weighted case of this inequality is presented in Theorem 8.9.5 below
(based on calculus).


Exercises 


1. Let n ≥ 2 be an integer. Prove that the function $x^n$ and its inverse $x^{1/n}$ are strictly
increasing on [0, ∞).

2. Prove the formulas $\sqrt[n]{xy} = \sqrt[n]{x} \sqrt[n]{y}$ and $\sqrt[m]{\sqrt[n]{x}} = \sqrt[mn]{x}$.

3. Prove that

\[ \Bigl( \frac{a + b}{2} \Bigr)^2 \ge ab \]

for all a, b > 0. When does equality occur?

4. An easy consequence of the Binomial Theorem is Bernoulli's Inequality, which
asserts that $(1 + x)^n \ge 1 + nx$ for every n ∈ N and x > 0. Infer that for every
a > 1 and every n ∈ N, n ≥ 2, we have

5. Suppose that $x_1, x_2, \ldots, x_n$ are positive numbers such that $x_1 x_2 \cdots x_n = 1$. Prove
that $x_1 + x_2 + \cdots + x_n \ge n$. Then, infer the AM-GM Inequality.

6. Infer from the rearrangement inequality of Hardy-Littlewood-Pólya the AM-GM
Inequality.


1.4 The Archimedean Property 


Whenever you try to measure a length using your favorite instrument, one of two 
things is possible: the unit fits an exact number of times or the endpoint of the object 
is somewhere in between two consecutive multiples of the unit. 

In mathematical terms, this is expressed in the next result. 


1.4.1 Theorem (The Archimedean Property) Let u > 0. Then, for every x ∈ R,
there is a unique integer number n such that

\[ nu \le x < (n + 1)u. \]

Fig. 1.2 The graphs of the integer part and fractional part functions on [−2, 3]


Proof First, we show that there is q ∈ Z such that x < qu. Assuming the contrary,
it follows that the set A = {nu : n ∈ Z} is bounded above and the Completeness
Axiom assures the existence of z = sup A. As z is the least upper bound, there must be
an integer number m such that mu > z − u, which means (m + 1)u > z, contradicting
the choice of z.

By applying the above remark to −x, we get the existence of an integer number
p such that pu < x. Consequently, there exist two integer numbers p and q such that
pu < x < qu. On the other hand, [pu, qu) can be represented as a finite union of
pairwise disjoint intervals of length u,

\[ [pu, (p + 1)u) \cup \cdots \cup [(q - 1)u, qu), \]

which ensures that x belongs to one (and only one) of the component intervals.


In particular, when u = 1, we obtain that for every x ∈ R, there is a unique integer
n such that n ≤ x < n + 1. This number is called the integer part of x and is denoted
⌊x⌋. Then

\[ \lfloor x \rfloor \le x < \lfloor x \rfloor + 1 \]

and thus

\[ \lfloor x \rfloor = \max\{n \in \mathbb{Z} : n \le x\} \]

for all x ∈ R.
The difference x − ⌊x⌋, called the fractional part of x, is denoted by {x}. The
graphs of the functions integer part and fractional part are shown in Fig. 1.2.
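For readers who want to experiment, the integer part and the fractional part can be sketched in Python with `math.floor`. The function names `integer_part` and `fractional_part` are illustrative. Note that the built-in `int()` truncates toward zero, so it differs from the integer part for negative non-integers.

```python
import math
from fractions import Fraction

def integer_part(x):
    """Largest integer n with n <= x."""
    return math.floor(x)

def fractional_part(x):
    """x minus its integer part; always lies in [0, 1)."""
    return x - math.floor(x)

assert integer_part(2.75) == 2 and fractional_part(2.75) == 0.75
assert integer_part(-1.5) == -2 and fractional_part(-1.5) == 0.5
assert int(-1.5) == -1   # truncation toward zero, not the integer part

x = Fraction(-7, 3)
assert integer_part(x) <= x < integer_part(x) + 1
```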


1.4.2 Lemma For every real number x and every integer n,

\[ \lfloor x + n \rfloor = \lfloor x \rfloor + n. \]

Proof The formula ⌊x⌋ ≤ x < ⌊x⌋ + 1 implies ⌊x⌋ + n ≤ x + n < ⌊x⌋ + n + 1,
from which we conclude that ⌊x + n⌋ = ⌊x⌋ + n since ⌊x⌋ + n and ⌊x⌋ + n + 1 are
consecutive integers.

1.4.3 Corollary The equality

\[ \{x + n\} = \{x\} \]

holds for every real number x and for every integer n.


Proof By using Lemma 1.4.2, we infer that

\[ \{x + n\} = x + n - \lfloor x + n \rfloor = x + n - (\lfloor x \rfloor + n) = \{x\}. \]
The property outlined in Corollary 1.4.3 is that of periodicity of the fractional
part function. Given an arbitrary nonempty set E, a function f : R → E is called
periodic if there is a nonzero number T (called a period) such that

\[ f(x + T) = f(x) \quad \text{for all } x \in \mathbb{R}. \]

If T is a period of the function f, then nT is also a period, for all n ∈ Z, n ≠ 0.
A positive period T is the prime period of the function f if every period of f is a
multiple of T. Thus, the prime period (if it exists) is the minimum of the set of all
positive periods. The fractional part function has the prime period 1. An example of
a periodic function that does not admit a prime period is indicated in Exercise 1.
Every real-valued function f defined on a bounded interval [a, b] can be extended
to a periodic function $\tilde{f}$ defined on R provided that f(a) = f(b). Indeed, its extension
by periodicity is given by the formula

\[ \tilde{f}(x) = f\Bigl( x - \Bigl\lfloor \frac{x - a}{b - a} \Bigr\rfloor (b - a) \Bigr), \quad x \in \mathbb{R}. \]
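The extension by periodicity translates directly into code. The sketch below assumes the formula x ↦ f(x − ⌊(x − a)/(b − a)⌋(b − a)); the name `periodic_extension` is hypothetical, and the example function f(x) = 1 − |x| on [−1, 1] is the triangle wave mentioned in the exercises of this section.

```python
import math

def periodic_extension(f, a, b):
    """Extend f from [a, b] (with f(a) == f(b)) to a (b - a)-periodic function."""
    period = b - a
    def f_tilde(x):
        # Shift x back into [a, b) by an integer number of periods.
        return f(x - math.floor((x - a) / period) * period)
    return f_tilde

triangle = periodic_extension(lambda x: 1 - abs(x), -1, 1)

assert triangle(0) == 1 and triangle(2) == 1      # period 2
assert triangle(1) == 0 and triangle(3) == 0
assert triangle(2.5) == 0.5 and triangle(-2.5) == 0.5
```

The right endpoint b is mapped to a, which is harmless precisely because f(a) = f(b) is assumed.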


1.4.4 Lemma (The density of Q in R) Every nonempty interval (a, b) contains
rational numbers.


Proof By the Archimedean Property, there is n ∈ N* such that n(b − a) > 1 and there
is m ∈ Z such that m/n ≤ a < (m + 1)/n. Consequently,

\[ a < r = \frac{m}{n} + \frac{1}{n} < a + b - a = b, \]

where r is a rational number.
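The proof is constructive, and its two steps can be followed literally with exact rational arithmetic. This is a minimal sketch: the name `rational_between` is illustrative, and floating-point inputs are converted to the exact fractions they represent, so irrational endpoints are only handled through such approximations.

```python
import math
from fractions import Fraction

def rational_between(a, b):
    """Return a rational r with a < r < b, following the proof of the lemma."""
    a, b = Fraction(a), Fraction(b)
    if not a < b:
        raise ValueError("need a < b")
    n = math.floor(1 / (b - a)) + 1   # guarantees n * (b - a) > 1
    m = math.floor(n * a)             # guarantees m/n <= a < (m + 1)/n
    return Fraction(m + 1, n)

r = rational_between(Fraction(1, 3), Fraction(1, 2))
assert Fraction(1, 3) < r < Fraction(1, 2)

s = rational_between(math.sqrt(2), 1.5)
assert math.sqrt(2) < s < 1.5
```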

The property asserted in Lemma 1.4.4 leads to the general concept of dense subset.
A subset A of R is said to be dense (in R) if every nondegenerate interval contains
at least one element of A or, equivalently, if for every x ∈ R and every ε > 0, there
exists a ∈ A such that

\[ |x - a| < \varepsilon. \]


There are many other results concerning the density of Q in R. Here is one of 
combinatorial flavor: 




1.4.5 Theorem (P.G.L. Dirichlet) Suppose that x ∈ R\Q and N ∈ N*. Then there
are m, n ∈ Z such that 1 ≤ n ≤ N and

\[ \Bigl| x - \frac{m}{n} \Bigr| < \frac{1}{nN}. \]

Proof Since x is irrational, the numbers {x}, {2x}, ..., {Nx} are distinct, irrational, and
situated in the interval (0, 1). Thus, they belong to the following union of pairwise
disjoint intervals:

\[ \Bigl( 0, \frac{1}{N} \Bigr) \cup \cdots \cup \Bigl[ \frac{N-1}{N}, 1 \Bigr). \]

If the interval (0, 1/N) contains the number {nx}, then 0 < nx − ⌊nx⌋ < 1/N,
whence 0 < x − ⌊nx⌋/n < 1/(nN).

If none of the numbers {x}, {2x}, ..., {Nx} is in the interval (0, 1/N), then these N
numbers lie in the remaining N − 1 intervals, so two
of them (say {hx} and {kx}, with h > k) will belong to the same interval. This yields
|{hx} − {kx}| < 1/N, that is,

\[ |(h - k)x - (\lfloor hx \rfloor - \lfloor kx \rfloor)| < 1/N. \]

Let n = h − k and m = ⌊hx⌋ − ⌊kx⌋. Then n, m ∈ Z, 1 ≤ n ≤ N and
|x − m/n| < 1/(nN).


The proof of Theorem 1.4.5 illustrates the so-called Dirichlet’s Principle (also 
known as the Pigeonhole Principle): If n + 1 elements belong to the union of n 
disjoint sets, then one of the sets must contain at least two of these elements. 
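Dirichlet's theorem guarantees that a good denominator n ≤ N exists, so it can simply be searched for. The sketch below is a brute-force search, not the pigeonhole argument of the proof; the name `dirichlet_approx` is hypothetical, and a floating-point value stands in for the irrational number x (it is converted to the exact fraction it represents).

```python
import math
from fractions import Fraction

def dirichlet_approx(x, N):
    """Find (m, n) with 1 <= n <= N and |x - m/n| < 1/(n*N) by direct search."""
    x = Fraction(x)
    for n in range(1, N + 1):
        m = round(n * x)              # nearest integer to n*x
        if abs(x - Fraction(m, n)) < Fraction(1, n * N):
            return m, n
    return None                       # not expected to happen

x = math.sqrt(2)
m, n = dirichlet_approx(x, 10)
assert 1 <= n <= 10
assert abs(Fraction(x) - Fraction(m, n)) < Fraction(1, n * 10)
```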

The Archimedean Property simplifies the computation of the infimum (supremum)
of many important sets of numbers. Here is an example:


1.4.6 Lemma We have

\[ \inf \Bigl\{ \frac{1}{n} : n \in \mathbb{N}^* \Bigr\} = 0. \]

Proof Let $u = \inf \{ \frac{1}{n} : n \in \mathbb{N}^* \}$. Clearly, u ≥ 0. If u > 0, then by the Archimedean
Property there is N ∈ N* such that Nu > 1. Equivalently, u > 1/N, which contradicts
the choice of u, and thus u = 0.


Exercises 


1. (a) Find the prime periods of the functions {2x} and {5}. 
(b) Prove that the characteristic function of Q,

\[ \chi_{\mathbb{Q}}(x) = \begin{cases} 1 & \text{if } x \in \mathbb{Q} \\ 0 & \text{if } x \in \mathbb{R} \setminus \mathbb{Q}, \end{cases} \]

admits as a period every nonzero rational number and conclude that it has no
prime period.

2. Prove the following generalization of the property of density of Q in R: Let
$(a_n)_n$ be a sequence of positive numbers whose infimum is 0. Then the set
$\{m a_n : m \in \mathbb{Z},\ n \in \mathbb{N}\}$ is dense in R.

Infer from this fact that R \ Q is also dense in R.


3. (Lord Rayleigh). Let α and β be two positive irrational numbers such that
1/α + 1/β = 1. Consider the sets

\[ A = \{ \lfloor n\alpha \rfloor : n \in \mathbb{N} \} \quad \text{and} \quad B = \{ \lfloor n\beta \rfloor : n \in \mathbb{N} \}. \]

Prove that A ∩ B = {0} and A ∪ B = N.


4. (Applications to Dirichlet's Principle). Let F be the family of all triplets of integer
numbers, not all zero, whose absolute values are less than $10^6$.
(a) Prove that $|a + b\sqrt{2} + c\sqrt{3}| > 10^{-21}$ for every (a, b, c) ∈ F.
(b) Prove that there exists (a, b, c) ∈ F such that $|a + b\sqrt{2} + c\sqrt{3}| < 10^{-11}$.


5. (a) Prove Kronecker's Density Theorem: If α > 0 is an irrational number, then
the set

\[ \{ m\alpha - n : m, n \in \mathbb{N} \} \]

is dense in R.
(b) Infer from this theorem that for any two arithmetic progressions

\[ a_1,\ a_1 + d_1,\ a_1 + 2d_1,\ \ldots \]
\[ a_2,\ a_2 + d_2,\ a_2 + 2d_2,\ \ldots \]

with $a_1, a_2 \in \mathbb{R}$, $d_1 > 0$, $d_2 > 0$ and $d_1/d_2 \in \mathbb{R} \setminus \mathbb{Q}$, there is a term of the first
progression and a term of the second one such that the absolute value of their
difference is less than $10^{-10}$.

Remark For another application of Kronecker's Density Theorem, see Exercise 8,
at the end of Sect. 6.5.


6. (Triangle waves) (a) Find the formula of the extension by periodicity of the
function f(x) = 1 − |x|, x ∈ [−1, 1]. Sketch its graph. (b) Do the same for the function
g(x) = |2x + 1| − 3, x ∈ [−3, 2].


7. (The Three Gap Theorem of H. Steinhaus). Let α be an irrational number and N
a positive integer. Then the points {kα} for k = 0, ..., N − 1 partition the unit
interval into N subintervals which have at most three distinct lengths, one being
the sum of the other two.


1.5 The Decimal Representation 


The Archimedean Property has a multiplicative analogue which can be easily proved 
by adapting the argument of Theorem 1.4.1: 




1.5.1 Theorem Let p > 1. Then, for every a > 0 there is a unique integer n such
that

\[ p^n \le a < p^{n+1}. \]

According to this result, every positive number a can be uniquely located in an
interval $[10^n, 10^{n+1})$. Taking into account the partition

\[ [10^n, 10^{n+1}) = \bigcup_{k=1}^{9} [k \cdot 10^n, (k + 1) \cdot 10^n), \]

we infer the existence of a unique $d_n \in \{1, \ldots, 9\}$ such that

\[ d_n \cdot 10^n \le a < (d_n + 1) \cdot 10^n. \]

We call the number $10^n$ the magnitude of a, and $d_n$ the leading digit of a.
Next, divide the interval $[d_n \cdot 10^n, (d_n + 1) \cdot 10^n)$ into 10 closed-open intervals
of the same length,

\[ [d_n \cdot 10^n, (d_n + 1) \cdot 10^n) = \bigcup_{k=0}^{9} [d_n \cdot 10^n + k \cdot 10^{n-1},\ d_n \cdot 10^n + (k + 1) \cdot 10^{n-1}), \]

from which we get a unique $d_{n-1} \in \{0, \ldots, 9\}$ such that

\[ d_n \cdot 10^n + d_{n-1} \cdot 10^{n-1} \le a < d_n \cdot 10^n + (d_{n-1} + 1) \cdot 10^{n-1}. \]

Proceeding this way, we construct a sequence $(d_{n-k})_k$ of numbers such that
$d_{n-k} \in \{0, \ldots, 9\}$ and

\[ d_n \cdot 10^n + \cdots + d_{n-k} \cdot 10^{n-k} \le a < d_n \cdot 10^n + \cdots + d_{n-k} \cdot 10^{n-k} + 10^{n-k} \tag{1.2} \]

for all nonnegative integers k. We call $(d_{n-k})_k$ the sequence of digits of a; $d_{n-k}$
represents the digit of order $10^{n-k}$. The digits of negative order are called decimals.

The mapping $x \mapsto (d_{n-k})_k$ is injective. Indeed, if x and y have the same sequence
of decimals, then by (1.2) we get

\[ |x - y| < 10^{n-k} \quad \text{for all } k \ge 0, \]

so by Lemma 1.4.6 we conclude that x − y = 0.
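The construction of the digits is an algorithm, and it can be run exactly on rational inputs. The following sketch is illustrative: the name `decimal_digits` is hypothetical, and exact arithmetic with `Fraction` is an implementation choice made here so that the inequalities (1.2) hold without rounding errors.

```python
from fractions import Fraction

def decimal_digits(a, count):
    """Return (n, [d_n, d_{n-1}, ...]) with `count` digits, for a > 0."""
    a = Fraction(a)
    if a <= 0:
        raise ValueError("need a > 0")
    # Locate the magnitude: the unique n with 10**n <= a < 10**(n + 1).
    n = 0
    while Fraction(10) ** (n + 1) <= a:
        n += 1
    while Fraction(10) ** n > a:
        n -= 1
    digits, rest = [], a
    for k in range(count):
        unit = Fraction(10) ** (n - k)
        d = rest // unit              # the unique digit with d*unit <= rest < (d+1)*unit
        digits.append(d)
        rest -= d * unit
    return n, digits

assert decimal_digits(Fraction(314159, 100000), 6) == (0, [3, 1, 4, 1, 5, 9])
assert decimal_digits(Fraction(1, 8), 4) == (-1, [1, 2, 5, 0])
assert decimal_digits(250, 3) == (2, [2, 5, 0])
```

In each call the first digit is the leading digit and therefore lies in {1, ..., 9}, while the later ones lie in {0, ..., 9}.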


An important remark is the nonexistence of an integer N such that $d_{n-k} = 9$ for
all k ≥ N. Indeed, if the contrary is true, then there exists z such that

\[ z + 9 \cdot 10^{n-N} + 9 \cdot 10^{n-N-1} + \cdots + 9 \cdot 10^{n-k} \le a < z + 9 \cdot 10^{n-N} + \cdots + 9 \cdot 10^{n-k} + 10^{n-k} = z + 10^{n-N+1} \]

for all k ≥ N. Thus

\[ z + 10^{n-N+1} = \sup_{k \ge N} \bigl( z + 9 \cdot 10^{n-N} + 9 \cdot 10^{n-N-1} + \cdots + 9 \cdot 10^{n-k} \bigr) \le a < z + 10^{n-N+1}, \]

which is a contradiction.
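Every sequence $(d_{n-k})_k$ with $d_{n-k} \in \{0, \ldots, 9\}$, that does not end in an infinite
string of nines, is the sequence of digits of some a > 0. For this, consider

\[ a = \sup_{k \ge 0} \bigl( d_n \cdot 10^n + \cdots + d_{n-k} \cdot 10^{n-k} \bigr). \tag{1.3} \]

Indeed, for every k ≥ 0, there is a positive integer m such that $d_{n-k-m} \le 8$, hence

\[
\begin{aligned}
d_n \cdot 10^n + \cdots + d_{n-k} \cdot 10^{n-k}
&\le d_n \cdot 10^n + \cdots + d_{n-k} \cdot 10^{n-k} + \cdots + d_{n-j} \cdot 10^{n-j} \\
&\le d_n \cdot 10^n + \cdots + d_{n-k} \cdot 10^{n-k} + 9 \bigl( 10^{n-k-1} + \cdots + 10^{n-j} \bigr) - 10^{n-k-m} \\
&< d_n \cdot 10^n + \cdots + d_{n-k} \cdot 10^{n-k} + 10^{n-k} - 10^{n-k-m}
\end{aligned}
\]

for every j ≥ k + m. Taking the supremum over j, we can conclude that

\[ d_n \cdot 10^n + \cdots + d_{n-k} \cdot 10^{n-k} \le a < d_n \cdot 10^n + \cdots + d_{n-k} \cdot 10^{n-k} + 10^{n-k} \]

for all k.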
Usually, the representation formula (1.3) is written as

\[ a = d_n \ldots d_0 . d_{-1} d_{-2} d_{-3} \ldots \tag{1.4} \]

and this is the meaning of the infinite decimal expansions of positive numbers.
If a > 0 has the representation (1.4), then the representation of −a is by definition

\[ -a = -d_n \ldots d_0 . d_{-1} d_{-2} d_{-3} \ldots \]

Also by definition,

\[ 0 = 0.0000000000\ldots \]

The extension of the field operations from the rational numbers to all real numbers
(expressed as infinite decimal expansions) is based on compatibility with the order,
and leads to a very familiar construction of R. See [3] for details.

In practice, the decimal representation of real numbers is used under the form

\[ a = \pm 10^n \cdot 0.a_1 a_2 a_3 \ldots \]

where n ∈ Z, $a_1, a_2, a_3, \ldots$ are digits and $a_1 \neq 0$.
The role played by 10 in the infinite decimal representation of real numbers can
be taken by any integer p > 1. This yields the so-called p-adic representation

\[ a = \pm d_n \ldots d_0 . d_{-1} d_{-2} d_{-3} \ldots = \pm \sup_{k \ge 0} \bigl( d_n p^n + \cdots + d_{n-k} p^{n-k} \bigr), \]

where n ∈ Z, $d_n \neq 0$ and $d_n, d_{n-1}, d_{n-2}, \ldots$ belong to {0, ..., p − 1}. Notice
that the sequence $(d_{n-k})_k$ cannot end in a string of digits all equal to p − 1. The case p = 2 is also
known as the dyadic representation, while the case p = 3 is known as the triadic
representation.


Exercises 


1. Prove that rational numbers are distinguished among all real numbers by the fact
that their infinite decimal expansions are eventually periodic.

2. Prove that the first 100 decimals of $(\sqrt{26} + 5)^{101}$ are zeroes.
3. Compute $\sqrt{0.99\ldots9}$ (100 nines) within an error of $10^{-10}$.


1.6 Countable Sets. Uncountable Sets 


The aim of this section is to discuss the problem of comparing sets according to their
size. The starting point is that of finite sets. A set A is finite if either A = ∅ or there is
a bijective function from A onto {1, 2, ..., n}, for some positive integer n. In the first
case, we say that the cardinality of A is 0, while in the second, the cardinality of A is
n. Thus, the cardinality of a finite set is the number of its elements. The fact that the
cardinality of A is n is usually denoted by |A| = n or #(A) = n.

The sets that are not finite are called infinite sets. Clearly, N is an infinite set. More 
examples come through the following simple result: 


1.6.1 Lemma Suppose that A ⊂ B. If B is finite, then A is finite, and if A is infinite,
then B is also infinite.


As a consequence, the sets Z, Q, and R are infinite. We already noticed that
the sets are either finite or infinite. How can we distinguish among the infinite sets
from the point of view of their size? Cantor's idea (still in use nowadays) consists in
searching for a pairing.


1.6.2 Definition Two sets A and B are cardinally equivalent (or have the same
cardinality) if there is a bijective function from A onto B. We write |A| = |B| in this
case. Similarly, we say that the cardinality of A is less than or equal to the cardinality
of B (that is, |A| ≤ |B|) if there is an injective function from A to B.


It is easy to verify the following properties:


|A| = |A|;
If |A| = |B|, then |B| = |A|;
If |A| = |B| and |B| = |C|, then |A| = |C|.


1.6.3 Definition A set A is called countable if either A is finite or |A| = |N|. A
countable set that is also infinite is called countably infinite. The cardinality of the
countably infinite set N is usually denoted $\aleph_0$ (read aleph nought).

Any set that is not countable is called uncountable.


Since the mapping σ : N → N*, σ(n) = n + 1, of taking the successor is
bijective, it follows that N* is also countably infinite. Z is another example of a
countably infinite set. In fact, a bijection f : N → Z can be defined by

\[ f(n) = (-1)^{n+1} \Bigl\lfloor \frac{n + 1}{2} \Bigr\rfloor, \]

where ⌊·⌋ is the integer part function.
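A bijection between N and Z of this kind is easy to test on an initial segment. The sketch below assumes the reading f(n) = (−1)ⁿ⁺¹⌊(n + 1)/2⌋ of the formula, with N starting at 0; the function names and the explicit inverse are illustrative additions, not taken from the text.

```python
def nat_to_int(n):
    """f(n) = (-1)**(n + 1) * floor((n + 1) / 2), for n = 0, 1, 2, ..."""
    return (-1) ** (n + 1) * ((n + 1) // 2)

def int_to_nat(z):
    """Inverse map: positive integers go to odd indices, the others to even ones."""
    return 2 * z - 1 if z > 0 else -2 * z

assert [nat_to_int(n) for n in range(7)] == [0, 1, -1, 2, -2, 3, -3]
assert all(int_to_nat(nat_to_int(n)) == n for n in range(1000))
assert all(nat_to_int(int_to_nat(z)) == z for z in range(-500, 500))
```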

N × N is also countably infinite. See Exercise 4.

The power set P(N), of the set of natural numbers, is uncountable. See Sect. 1.7, 
Exercise 5. 

From the cardinality point of view, the countably infinite sets are the smallest 
infinite sets. 


1.6.4 Lemma Every infinite set has a countably infinite subset. 


Proof Let A be an infinite set. Since A ≠ ∅, we can choose (via the Axiom of Choice) an
element $a_1 \in A$ and consider the set $A_1 = A \setminus \{a_1\}$ (which is also infinite). By the same
argument, we can choose an element $a_2 \in A_1$ and consider the set $A_2 = A \setminus \{a_1, a_2\}$.
In this way, a countably infinite set $\{a_1, a_2, a_3, \ldots\}$ is obtained and, by construction,
it is a subset of A.


1.6.5 Corollary The union of an infinite set A and a finite set F is a set of the same 
cardinality as A. 


Proof It suffices to consider the case where F consists of a single point $a_0$. The
general case will then follow by induction.
According to Lemma 1.6.4, A contains a countably infinite subset $N = \{a_1, a_2,
a_3, \ldots\}$. A bijection between A and $A \cup \{a_0\}$ is given by the function

\[ f(x) = \begin{cases} a_{n-1} & \text{if } x = a_n \\ x & \text{if } x \in A \setminus N. \end{cases} \]

A nonempty set A is countable if and only if it can be enumerated:

\[ A = \{a_1, a_2, a_3, \ldots\}. \]


This is motivated by the following result. 
1.6.6 Lemma Every subset of N is countable. 


Proof Let X be an infinite subset of N. Since N is well ordered, X admits a smallest
element, say $x_0$. Then $X \setminus \{x_0\}$ is also an infinite subset of N and thus admits a smallest
element $x_1$, and so on. The final result is an increasing sequence $x_0, x_1, x_2, \ldots$ of
elements of X. We have $X = \{x_0, x_1, x_2, \ldots\}$ since for every y ∈ X, the set {x ∈ X :
x < y} is finite, containing say n elements, and thus $y = x_n$.


1.6.7 Corollary If A is a nonempty set, then the following assertions are equivalent:


(a) A is countable.
(b) There is a one-to-one function f : A → N.
(c) There is an onto function g : N → A.


Proof (a) implies (b) is clear.
The fact that (b) implies (c) can be easily checked by choosing an element a ∈ A
and attaching to f the function

\[ g(n) = \begin{cases} f^{-1}(n) & \text{if } n \in f(A) \\ a & \text{if } n \in \mathbb{N} \setminus f(A). \end{cases} \]

For (c) implies (a), notice that if g : N → A is onto, then the function f : A → N
which associates to each a ∈ A the least element of $g^{-1}(\{a\})$ is injective. By Lemma
1.6.6, we infer that f(A) is countable, hence A itself is countable.


1.6.8 Theorem The countable union (as well as the finite Cartesian product)
of countable sets is a countable set.


Proof Let $(A_i)_{i \in \mathbb{N}}$ be a family of sets under the hypothesis of Theorem 1.6.8. Then
$A_i = \{a_{i1}, a_{i2}, a_{i3}, \ldots\}$ and $\bigcup_i A_i$ can also be enumerated. In fact, if the sets $A_i$
are mutually disjoint, their union will be covered following the circuit as indicated
below:

a_11 → a_12    a_13 → ...
     ↙      ↗
a_21    a_22    a_23   ...
 ↓   ↗
a_31    a_32    ...

If the sets $A_i$ are not mutually disjoint, we build a new family of sets

\[ B_1 = A_1, \quad B_2 = A_2 \setminus B_1, \quad B_3 = A_3 \setminus (B_1 \cup B_2), \ \ldots \]

The sets $B_i$ are mutually disjoint and have the same union as the sets $A_i$. Consequently,
we may apply to them the preceding reasoning and the proof is done.
The proof in the case of finite Cartesian products is similar.
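The enumeration along diagonals can be sketched as a Python generator. This is an illustration under stated assumptions: every set in the family is taken to be infinite and given by an indexing function `family(i, j)` returning the j-th element of the i-th set (both names are hypothetical), each diagonal is traversed in a fixed direction rather than in the zigzag of the diagram, and repeated elements are skipped, which plays the role of passing to the disjoint sets B_i.

```python
from fractions import Fraction
from itertools import count, islice

def enumerate_union(family):
    """Yield each element of the union of the sets A_i exactly once."""
    seen = set()
    for s in count(2):                  # s = i + j labels the diagonals
        for i in range(1, s):
            x = family(i, s - i)        # element a_{ij} with i + j = s
            if x not in seen:           # skip elements already listed
                seen.add(x)
                yield x

# Example: A_i = {1/i, 2/i, 3/i, ...}; the union is the set of positive rationals.
positive_rationals = enumerate_union(lambda i, j: Fraction(j, i))
first = list(islice(positive_rationals, 5))
assert first == [Fraction(1), Fraction(2), Fraction(1, 2), Fraction(3), Fraction(1, 3)]
```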


1.6.9 Corollary The set Q of rational numbers is countable, and the same is true
for Q × Q, Q × Q × Q and so on.

Proof In fact, $\mathbb{Q} = \bigcup_{n \in \mathbb{N}^*} \{ \frac{m}{n} : m \in \mathbb{Z} \}$ and Theorem 1.6.8 applies.


Exercises 


1. (Inclusion–Exclusion Principle). Consider the finite sets $A_1, A_2, \ldots, A_n$. Show that

\[ |A_1 \cup A_2 \cup \cdots \cup A_n| = \sum_i |A_i| - \sum_{i<j} |A_i \cap A_j| + \sum_{i<j<k} |A_i \cap A_j \cap A_k| - \cdots + (-1)^{n+1} |A_1 \cap A_2 \cap \cdots \cap A_n|. \]

2. Suppose that A and B are nonempty finite sets having respectively m and n elements.
Prove that:
(a) the set of all functions from A to B has $n^m$ elements;
(b) if m = n, the set of all bijective functions from A to B has m! elements;
(c) if m ≤ n, the set of all injective functions from A to B has a number of
n(n − 1)···(n − m + 1) elements;
(d) if m ≥ n, the set of all surjective functions from A to B has

\[ n^m - \binom{n}{1} (n - 1)^m + \binom{n}{2} (n - 2)^m - \binom{n}{3} (n - 3)^m + \cdots + (-1)^{n-1} \binom{n}{n-1} \]

elements.


3. (R. Dedekind). Infer from Lemma 1.6.4 that a set is infinite if and only if it has 
the same cardinality as a proper subset of itself. 

4. Verify that the function g : N × N → N* defined by $g(m, n) = 2^m (2n + 1)$ is
bijective and conclude that N × N is a countably infinite set.

5. (Calkin and Wilf [4]). An explicit enumeration of the positive rational numbers
can be produced by concatenating blocks of $2^n$ numbers (n ∈ N). Start
with $B_0 = \bigl( \frac{1}{1} \bigr)$ and continue with the blocks $B_1, B_2, B_3, \ldots$, following the rule
$B_n = \bigl\{ \frac{a}{a+b}, \frac{a+b}{b} : \frac{a}{b} \in B_{n-1} \bigr\}$ for n ≥ 1:

\[ \frac{1}{1}, \frac{1}{2}, \frac{2}{1}, \frac{1}{3}, \frac{3}{2}, \frac{2}{3}, \frac{3}{1}, \frac{1}{4}, \frac{4}{3}, \frac{3}{5}, \frac{5}{2}, \frac{2}{5}, \frac{5}{3}, \frac{3}{4}, \frac{4}{1}, \ldots \]

Prove that the numerator and denominator of each ratio $\frac{a}{b} \in B_n$ are relatively
prime, every reduced positive rational number occurs in some block, and no
reduced positive rational number occurs in more than one block.


1.7 The Cardinality of R 


We know that R is larger than Q, but is its cardinality greater than $\aleph_0$? We shall
prove that the answer is affirmative. The basic ingredient is the following result:


1.7.1 Nested Intervals Lemma (G. Cantor) Consider a nested sequence of
bounded and closed intervals,

\[ [a_0, b_0] \supset [a_1, b_1] \supset [a_2, b_2] \supset \cdots. \]

Then its intersection $\bigcap_{n \in \mathbb{N}} [a_n, b_n]$ is nonempty.

Moreover, the intersection consists of exactly one point if and only if for every
ε > 0, there exists a natural number N such that $b_N - a_N < \varepsilon$.


Proof Notice first that every left endpoint $a_i$ is less than or equal to every right
endpoint $b_j$. In fact, if for example $i < j$, then the inclusion $[a_i, b_i] \supset [a_j, b_j]$ yields
$a_i \le a_j \le b_j$.

Keeping j fixed, we infer from the Completeness Axiom the existence of
$a = \sup_i a_i$ and also the fact that $a \le b_j$. As j was arbitrarily fixed, a similar reasoning
yields the existence of $b = \inf_j b_j$ and the inequality $a \le b$. A moment's reflection
shows that

$$\bigcap_{n \in \mathbb{N}} [a_n, b_n] = [a, b];$$

in particular, the intersection is nonempty. The intersection consists of exactly one
point precisely when $a = b$. The condition that appears in the statement of the Nested
Intervals Lemma is a reformulation of this fact.


1.7.2 Corollary The unit interval [0, 1] is uncountable.


Proof Suppose to the contrary that the interval [0, 1] is countable, that is, $[0, 1] =
\{a_1, a_2, a_3, \ldots\}$. Divide the interval $I_0 = [0, 1]$ into three subintervals [0, 1/3],
[1/3, 2/3], and [2/3, 1]. Let $I_1$ be one of these subintervals that does not contain $a_1$.
Then divide $I_1$ to obtain three new closed subintervals of equal length and denote by
$I_2$ one of them that does not contain $a_2$. Continuing this way, we get a nested sequence
of closed intervals, $I_0 \supset I_1 \supset I_2 \supset \cdots$. According to the Nested Intervals Lemma,
the intersection $\bigcap_{n \in \mathbb{N}} I_n$ should be nonempty. However, no element $a \in \bigcap_{n \in \mathbb{N}} I_n$ may
occur in the list $a_1, a_2, a_3, \ldots$ This contradiction shows that the unit interval is not
countable. □
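
The trisection step of this proof can be carried out mechanically for any finite list of points. The sketch below is only an illustration under that finiteness assumption (the function name `avoid` is ours, not the book's): it returns a closed subinterval of [0, 1] that misses every listed point, using exact rational arithmetic.

```python
from fractions import Fraction

def avoid(listed):
    """Return a closed interval [lo, hi] inside [0, 1] containing none of the listed points."""
    lo, hi = Fraction(0), Fraction(1)
    for x in listed:
        third = (hi - lo) / 3
        # The first and the last third are disjoint, so x cannot lie in both.
        if lo <= x <= lo + third:
            lo = hi - third   # x lies in the first third: keep the last third
        else:
            hi = lo + third   # x misses the first third: keep it
    return lo, hi

if __name__ == "__main__":
    points = [Fraction(1, 2), Fraction(1, 3), Fraction(0), Fraction(1)]
    print(avoid(points))
```

Each step keeps a closed third of the previous interval, so the intervals produced are nested exactly as in the proof; the argument for an infinite list still needs the Nested Intervals Lemma.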


Corollary 1.7.2 motivates the following concept:
1.7.3 Definition A set that is cardinally equivalent with the interval [0, 1] is called
a set of cardinality c.

Every nondegenerate interval [a, b] can be put in a bijective correspondence with
the interval [0, 1], for example, via the function

$$g(x) = \frac{x - a}{b - a}.$$


Consequently, every such interval has the cardinality c. According to Corollary 1.6.5, 
the intervals [a, b), (a, b] and (a, b) have the same cardinality. A bijective function 
between R and (—1, 1) is given by the formula 


One can easily exhibit examples of bijective functions relating the unbounded
intervals $(-\infty, b)$, $(-\infty, b]$, $(a, \infty)$ and $[a, \infty)$ to one of the bounded intervals (0, 1)
and [0, 1). See Exercise 2. This yields the following result:


1.7.4 Theorem (Cantor) R (and every nondegenerate interval) has the cardinality c. 
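
The bijections used before Theorem 1.7.4 can be tried out numerically. In the sketch below, `to_unit` is the affine map $g$ from the text, while `squash` and `unsquash` are an assumed example of a bijection between R and (-1, 1), namely $x \mapsto x/(1 + |x|)$ and its inverse; this is one standard choice and not necessarily the formula the text has in mind.

```python
from fractions import Fraction

def to_unit(x, a, b):
    """g(x) = (x - a)/(b - a): maps [a, b] onto [0, 1] (requires a < b)."""
    return (x - a) / (b - a)

def squash(x):
    """Assumed example: x -> x/(1 + |x|) maps R into (-1, 1)."""
    return x / (1 + abs(x))

def unsquash(y):
    """Inverse of squash on (-1, 1): y -> y/(1 - |y|)."""
    return y / (1 - abs(y))

if __name__ == "__main__":
    print(to_unit(Fraction(3), Fraction(2), Fraction(5)))   # a point of [2, 5] sent into [0, 1]
    print(squash(Fraction(10)), unsquash(squash(Fraction(10))))
```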


The following result reduces the existence of a bijection between two sets A and B 
to the existence of injections from A to B and from B to A. 


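1.7.5 Cantor-Schröder-Bernstein Theorem If A and B are two sets such that
$|A| \le |B|$ and $|B| \le |A|$, then $|A| = |B|$.


Proof Let $f : A \to B$ and $g : B \to A$ be two injective functions. We attach to them
a new function

$$\varphi : \mathcal{P}(A) \to \mathcal{P}(A), \quad \varphi(X) = \complement g(\complement f(X)).$$

Let $\mathcal{D} = \{X : X \in \mathcal{P}(A),\ X \subset \varphi(X)\}$ and $D = \bigcup_{X \in \mathcal{D}} X$. Noticing that $X \subset Y$
implies $\varphi(X) \subset \varphi(Y)$, we infer that $\varphi(D) = D$, that is, $\complement D = g(\complement f(D))$. Then the
function

$$h(x) = \begin{cases} f(x) & \text{if } x \in D \\ g^{-1}(x) & \text{if } x \in \complement D \end{cases}$$

proves to be bijective.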


Since $\mathbb{R} = \mathbb{Q} \cup (\mathbb{R} \setminus \mathbb{Q})$ and Q is only countable, it follows that the set of irrational
numbers has cardinality c.

Exercises


1. (Helly's Theorem for intervals). Let $(I_\alpha)_{\alpha \in A}$ be a family of nonempty intervals
such that $I_\alpha \cap I_\beta \ne \emptyset$ for $\alpha \ne \beta$. Prove that their intersection is nonempty in each
of the following two cases:

(a) A is a nonempty finite set;
(b) All intervals $I_\alpha$ are bounded and closed.


2. (a) Exhibit examples of bijective functions $f : [a, \infty) \to [0, 1)$ that map a into 0.
(b) Prove that all unbounded intervals of R are cardinally equivalent to [0, 1].

3. Let $(x_n)_n$ be a sequence of nonzero real numbers. Prove that:

(a) there is $\alpha \in \mathbb{R}$ such that $\alpha x_n \in \mathbb{R} \setminus \mathbb{Q}$ for every index n;
(b) there is $\beta \in \mathbb{R}$ such that $\beta + x_n \in \mathbb{R} \setminus \mathbb{Q}$ for every index n.

4. (G. Cantor). Prove that the power set $\mathcal{P}(T)$ of any set T is cardinally equivalent
to the set $2^T$ of all functions defined on T and taking values in {0, 1} (or in any
set containing two elements).

[Hint: Use the mapping $A \to \chi_A$, where the characteristic function $\chi_A$ is viewed
as a function from T to {0, 1}.]

One can prove, using the Cantor-Schröder-Bernstein Theorem, that $\mathcal{P}(\mathbb{N})$ and
[0, 1] have the same cardinality, c.

5. (G. Cantor). Prove that $|A| \le |\mathcal{P}(A)|$ and $|A| \ne |\mathcal{P}(A)|$ for every set A. In
particular, $\mathcal{P}(\mathbb{N})$ is uncountable.
[Hint: The nontrivial case is where A is nonempty. Then the function $f(x) = \{x\}$
provides a one-to-one map from A to $\mathcal{P}(A)$, hence $|A| \le |\mathcal{P}(A)|$. Assuming the
existence of a bijection $g : A \to \mathcal{P}(A)$, we have to consider the set $X = \{a :
a \in A \text{ and } a \notin g(a)\}$. Then $X = g(x)$ for some $x \in A$, but then neither $x \in X$ nor
$x \notin X$ is possible, which is a contradiction.]


According to Exercise 4, $|\mathcal{P}(\mathbb{N})| = |2^{\mathbb{N}}|$. On the other hand, from the proof of
Theorem 5.5.8 below and the Cantor-Schröder-Bernstein Theorem, one can infer
that $|2^{\mathbb{N}}| = c$. Therefore $|\mathcal{P}(\mathbb{N})| = c$.
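
The diagonal set $X$ from the hint of Exercise 5 can be examined exhaustively on small finite sets. The following sketch (with the illustrative names `diagonal_set` and `all_maps_miss_diagonal`, which do not come from the text) enumerates every map $g : A \to \mathcal{P}(A)$ and checks that $X$ is never one of its values; it is a brute-force illustration, not a substitute for the proof.

```python
from itertools import product

def diagonal_set(A, g):
    """X = {a in A : a not in g(a)} for a map g : A -> P(A) stored as a dict."""
    return frozenset(a for a in A if a not in g[a])

def all_maps_miss_diagonal(A):
    """Check, for every map g : A -> P(A), that the diagonal set is not a value of g."""
    A = list(A)
    # All subsets of A, built from the 0/1 membership patterns.
    subsets = [frozenset(a for a, bit in zip(A, bits) if bit)
               for bits in product([0, 1], repeat=len(A))]
    for values in product(subsets, repeat=len(A)):
        g = dict(zip(A, values))
        if diagonal_set(A, g) in values:
            return False
    return True

if __name__ == "__main__":
    print(all_maps_miss_diagonal(range(3)))  # examines all 8**3 maps
```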


6. (R. Dedekind). Prove that the set of polynomials with integer coefficients is
countable.
[Hint: The height of a polynomial $P(x) = a_n x^n + \cdots + a_1 x + a_0$ of degree n is
defined by $\|P\| = |a_n| + \cdots + |a_1| + |a_0| + n$. Notice that for each positive integer
n, the set $A_n$ of all polynomials P with integer coefficients that satisfy $\|P\| \le n$
is finite.]


A real number is called algebraic if it is the root of a nonzero polynomial with
rational coefficients. The set of algebraic numbers constitutes a subfield of R, denoted
A. By Theorem 1.6.8 and the preceding exercise, it follows that A is countable (each
nonzero polynomial has only finitely many roots; think of the Fundamental Theorem
of Algebra). The real numbers that are not algebraic are called transcendental
numbers. Since the countable sets are thinner than those of cardinality c, it follows
(according to the Axiom of choice) that there do exist transcendental numbers!

What makes the above reasoning (published by G. Cantor in 1874) remarkable is
the fact that it allows one to conclude that a certain set is nonempty without indicating
any concrete element of it! The first examples of transcendental numbers were given by
J. Liouville in 1844. Actually, he proved a criterion of transcendence, by observing
that the irrational algebraic numbers cannot be very well approximated by rationals.
Precisely, if $\alpha$ is such a number, then there exist a constant $C > 0$ and an integer
$d \ge 2$ such that

$$\left|\alpha - \frac{m}{n}\right| \ge \frac{C}{n^d}$$

for all integers m and n with $n > 0$. See [5], Lemma 2.2, p. 7, for details.
For example,

$$\left|\sqrt{3} - \frac{m}{n}\right| \ge \frac{1}{4n^2}$$

for all integers m and n with $n > 0$.

Call a Liouville number every irrational number $\alpha$ such that for all $C, d > 0$ there exist
integers m and n with $n > 0$ such that $\left|\alpha - \frac{m}{n}\right| < \frac{C}{n^d}$. By the discussion above, every
Liouville number is transcendental. An example is $\alpha = 0.a_1a_2a_3\ldots$, where $a_n = 1$
if $n = m!$ ($m \in \mathbb{N}$) and $a_n = 0$ otherwise.
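
How well the number $\alpha$ of this example is approximated by rationals can be seen in exact arithmetic. The sketch below is an illustration under a stated assumption: `ALPHA` is only a stand-in for $\alpha$, obtained by keeping the decimal digits up to position $6! = 720$, and the names `liouville_partial` and `ALPHA` are ours. Truncating at the digit of position $k!$ gives a fraction with denominator $n = 10^{k!}$ whose distance to $\alpha$ is smaller than $2/n^{k+1}$.

```python
from fractions import Fraction
from math import factorial

def liouville_partial(k):
    """Rational m/n with n = 10**(k!), keeping the digits 1 at positions 1!, 2!, ..., k!."""
    return sum(Fraction(1, 10 ** factorial(j)) for j in range(1, k + 1))

# Stand-in for the full number (assumption: digits beyond position 6! are dropped).
ALPHA = liouville_partial(6)

if __name__ == "__main__":
    for k in range(1, 5):
        p = liouville_partial(k)
        n = p.denominator
        # The error is below 2 / n**(k+1), far smaller than C / n**d for fixed C, d once k is large.
        print(k, n == 10 ** factorial(k), ALPHA - p < Fraction(2, n ** (k + 1)))
```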


1.8 Notes and Remarks 


The notion of real number came as a result of a long and elaborate process,
originating in the desire of man to get the best possible perception of the surrounding
world. It was only in 1858 that the mathematician Richard Dedekind answered the
long-standing question "What are the real numbers?", by defining them as suitable
sets of rational numbers, called Dedekind cuts of Q. A Dedekind cut is a nonempty
proper subset $\alpha$ of Q that does not contain a greatest element and such that

$$p \in \alpha,\ q \in \mathbb{Q} \text{ and } q < p \text{ implies } q \in \alpha.$$

For example, every rational number r determines a Dedekind cut

$$r^* = \{p \in \mathbb{Q} : p < r\},$$

and in this case $r = \sup r^*$.
For every Dedekind cut $\alpha$,

$$q \in \mathbb{Q} \text{ and } q \notin \alpha \text{ implies } p < q \text{ for all } p \in \alpha.$$

We denote by PR the set of all Dedekind cuts of Q (that is, Dedekind's model of
R). PR can be endowed with a strict order relation < as follows:

$\alpha < \beta$ (equivalently, $\beta > \alpha$) if there is a rational number $p \in \beta \setminus \alpha$.

We put $\alpha \le \beta$ (equivalently, $\beta \ge \alpha$) if either $\alpha < \beta$ or $\alpha = \beta$. The equality of
Dedekind cuts is understood as the equality of sets.

It is easy to see that the relation $\le$ satisfies the Order Axioms 1.2.4 and Total
Ordering Axiom 1.2.6.

The addition of Dedekind cuts is defined by the formula

$$\alpha + \beta = \gamma,$$

where $\gamma = \{r \in \mathbb{Q} : r = p + q,\ p \in \alpha,\ q \in \beta\}$. The neutral element is $0^*$.
The negative of a Dedekind cut $\alpha$ is defined by

$$-\alpha = \{p \in \mathbb{Q} : p < -x \text{ for a suitable } x \in \mathbb{Q} \setminus \alpha\}.$$

The product of the Dedekind cuts $\alpha > 0^*$ and $\beta > 0^*$ is defined by the formula

$$\alpha\beta = \gamma,$$

where $\gamma = \{x \in \mathbb{Q} : x \le 0 \text{ or } x = pq \text{ for some } p \in \alpha,\ q \in \beta \text{ with } p > 0,\ q > 0\}$.
The other cases can be reduced to this one. For example, if $\alpha > 0^*$ and $\beta < 0^*$
we define

$$\alpha\beta = -(\alpha(-\beta)).$$


The details of Dedekind’s construction of R can be found in the book of Rudin [6]. 
The more traditional way of defining the real numbers as infinite decimal expansions 
is discussed by Davidson and Donsig in [3]. 

Cantor’s construction of R (as the set of classes of equivalence of Cauchy 
sequences of rationals) is presented by Hewitt and Stromberg [7]. An idea of this 
construction is offered in the proof of Theorem 5.7.1. 

All constructions of R provide Dedekind complete ordered fields, that is, fields
endowed with a total ordering $\le$ of their elements that verify the Axioms 1.2.1-1.2.7
(as stated in Sect. 1.2). It turns out that all such fields are algebraically and
order isomorphic. More precisely, for any two Dedekind complete ordered fields $F_1$
and $F_2$ with sets of positive elements $P_1$ and $P_2$ respectively, there exists a unique
one-to-one mapping $\varphi$ of $F_1$ onto $F_2$ such that

$$\varphi(x + y) = \varphi(x) + \varphi(y)$$
$$\varphi(xy) = \varphi(x)\varphi(y)$$
$$\varphi(x) \in P_2 \text{ if and only if } x \in P_1.$$

See the monograph of Hewitt and Stromberg [7], Theorem 5.34, for details.

The first explicit formulation of the Principle of Mathematical Induction was given
by Blaise Pascal in his Traité du triangle arithmétique (1665). The modern treatment
of the principle came only in the nineteenth century, with George Boole, Augustus
De Morgan, Giuseppe Peano, and Richard Dedekind.

The AM-GM inequality was first proved by Colin Maclaurin, but the argument
given in the text is due to Cauchy [8], Theorem 17, pp. 457-459.

An account of the computation of square roots in ancient India (in the context of
the discovery of positional decimal arithmetic with zero) can be found in the paper
by Bailey and Borwein [9]. See also Sect. 2.2, Exercise 2.

The radicals proved important in connection with the "compact formulas" for the
roots of polynomials of degree $\le 4$. In 1824, Niels Henrik Abel proved the
impossibility of solving the quintic equations in radicals. Eventually that led to the
modern approach to the problem of solving equations, which consists of a qualitative
part, concerning the existence and uniqueness of solutions, and a quantitative part,
devoted to numerical algorithms for fast approximation of solutions. Some ideas are
offered in Sects. 6.4 and 8.4 and Appendix A.

Most of the main results in one variable Real Analysis can be seen as direct 
consequences of Dedekind’s completeness axiom. The Nested Interval Property + 
Archimedean Property actually implies this axiom and the same is true for many 
other results that are presented in the following chapters: Convergence of bounded 
monotonic sequences; Cauchy Completeness; Bolzano—Weierstrass Theorem; Inter- 
mediate Value Theorem, etc. See the paper of Teismann [10] for details. For other 
unifying principles of Real Analysis of one variable see the papers of Gordon [11], 
and Moss and Roberts [12]. 

The entire theory of cardinal equivalence presented in Sect. 1.6 is due to Cantor,
the founder of set theory. His proof of the existence of transcendental numbers
(mentioned after Exercise 6 in Sect. 1.7) was controversial at the time and took
time to be accepted. The Cantor-Schröder-Bernstein Theorem was only intuited by
Georg Cantor, the first valid proofs being given by Richard Dedekind and Felix
Bernstein.

How many points on a line are there? Is there any other cardinality between $\aleph_0$ and
c? In 1964 came a surprising answer by Paul J. Cohen: The Continuum Hypothesis
(that is, the assertion that between $\aleph_0$ and c there are no cardinals) is independent
of the axiomatic system ZFC of set theory. Thus, one cannot settle the status of the
continuum hypothesis by using the current methods of set theory.


References 


1. Halmos, P.R.: Naive Set Theory. Springer, Heidelberg (1974)

2. Marek, V.W., Mycielski, J.: Foundations of mathematics in the twentieth century. Am. Math.
Mon. 108, 449-468 (2001)

3. Davidson, K.R., Donsig, A.P.: Real Analysis with Real Applications. Prentice Hall, New Jersey
(2002)

4. Calkin, N., Wilf, H.S.: Recounting the rationals. Am. Math. Mon. 107(4), 360-363 (2000)

5. Oxtoby, J.: Measure and Category. Springer, Heidelberg (1971)

6. Rudin, W.: Principles of Mathematical Analysis, 3rd edn. McGraw-Hill, New York (1976)

7. Hewitt, E., Stromberg, K.: Real and Abstract Analysis (Second printing corrected). Springer,
Heidelberg (1969)

8. Cauchy, A.-L.: Cours d'Analyse de l'École Royale Polytechnique, Ière partie, Analyse
Algébrique. In: Gabay, J. (ed.) Reprinted. Paris (1821), (1989)

9. Bailey, D.H., Borwein, J.M.: Ancient Indian square roots: an exercise in forensic
paleo-mathematics. Am. Math. Mon. 119, 646-657 (2012)

10. Teismann, H.: Toward a more complete list of completeness axioms. Am. Math. Mon. 120(2),
99-114 (2013)

11. Gordon, R.A.: The use of tagged partitions in elementary real analysis. Am. Math. Mon. 105(2),
107-117 (1998)

12. Moss, R.M.F., Roberts, G.T.: A creeping lemma. Am. Math. Mon. 75, 649-651 (1968)


Chapter 2 
Limits of Real Sequences 


Any real number is made accessible through its rational approximations, for
example, by cutting off the decimals starting with the (n + 1)th one. As n increases, these
approximations come closer to the given real number, a process that lies at the heart
of the subject of convergence. The study of many algorithms (such as the Babylonian 
algorithm for extracting the square root) needs some theoretical considerations of 
convergence and limits, which can be found in this chapter. 


2.1 Convergent Sequences 


The notion of a sequence of real numbers is motivated by the various algorithms 
that make available a certain object by its successive approximations in a class of 
well-behaved objects. From this point of view, the main problem in connection with 
a sequence is its behavior for large values of indices. 


2.1.1 Definition A sequence $(a_n)_{n \ge 0}$ is called convergent to the number $\ell$
(abbreviated, $a_n \to \ell$) if for each $\varepsilon > 0$, there is a natural number N such that for all $n \ge N$,
we have $|a_n - \ell| < \varepsilon$.


In other words, $a_n \to \ell$ means that for each $\varepsilon > 0$, there is an index N such that
for all $n \ge N$, we have

$$\ell - \varepsilon < a_n < \ell + \varepsilon.$$


The real number $\ell$, which appears in the previous definition, if it exists, is unique.
In fact, if $a_n \to \ell$ and $a_n \to \ell'$, then for each $\varepsilon > 0$ there is an index N such that
$|a_n - \ell| < \varepsilon/2$ and $|a_n - \ell'| < \varepsilon/2$ for all $n \ge N$. This yields

$$|\ell - \ell'| \le |\ell - a_n| + |a_n - \ell'| < \varepsilon/2 + \varepsilon/2 = \varepsilon$$

for all $n \ge N$. Since $\varepsilon > 0$ was arbitrarily fixed, we get

$$|\ell - \ell'| \le \inf \{\varepsilon : \varepsilon > 0\} = 0,$$

and thus $\ell - \ell' = 0$.
We call the number $\ell$ which appears in Definition 2.1.1 the limit of the sequence
$(a_n)_n$ and denote the convergence of $(a_n)_n$ to $\ell$ also by

$$\lim_{n \to \infty} a_n = \ell$$

(read: the limit of $a_n$ as n tends to infinity equals $\ell$). The precise meaning of the
phrase "n tends to infinity" is clarified at the end of Sect. 2.5, where we discuss the
sequences with infinite limits.
Intuitively, the convergence means that the terms $a_n$ become arbitrarily close to
the limit for n sufficiently large.
The convergent sequences are the sequences $(a_n)_n$ for which there exists a real
number $\ell$ such that $a_n \to \ell$.
The sequences that are not convergent are said to be divergent.
The theory of convergent sequences can be adapted mutatis mutandis to the case
of sequences indexed over sets of the form $\{n \in \mathbb{Z} : n \ge n_0\}$.
Every constant sequence is convergent (precisely, to the common value of its
terms).
The sequence $(1/n)_{n \ge 1}$ is convergent to 0 since $|1/n - 0| < \varepsilon$ for all $n \ge [1/\varepsilon] + 1$.
More generally,

$$\frac{1}{n^\alpha} \to 0 \quad \text{for all } \alpha > 0.$$

The alternating sequence $((-1)^n)_n$ is bounded and divergent.
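
The index $N = [1/\varepsilon] + 1$ used for the sequence $(1/n)_n$ can be computed and spot-checked directly. This is a minimal sketch in exact rational arithmetic; the helper names `index_for` and `tail_is_close` are illustrative, and the check covers only finitely many indices, so it illustrates the definition rather than proving convergence.

```python
from fractions import Fraction
from math import floor

def index_for(eps):
    """Return N = [1/eps] + 1; then 1/n < eps for every n >= N."""
    return floor(1 / Fraction(eps)) + 1

def tail_is_close(N, eps, how_many=1000):
    """Check |1/n - 0| < eps for n = N, ..., N + how_many - 1 (a finite spot check)."""
    return all(Fraction(1, n) < eps for n in range(N, N + how_many))

if __name__ == "__main__":
    for eps in (Fraction(1, 2), Fraction(1, 10), Fraction(3, 1000)):
        N = index_for(eps)
        print(eps, N, tail_is_close(N, eps))
```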
The next result is an easy consequence of the definition of convergence. 


2.1.2 Lemma If two sequences differ in a finite number of terms, then they have the
same status from the point of view of convergence (and the same limit when they are
convergent).


Actually, more is true. If we change, add, or remove a finite number of terms of 
a sequence, then the resulting sequence has the same status. 


2.1.3 Lemma Every subsequence of a convergent sequence is convergent to the 
same limit. 


Proof Suppose that $a_n \to \ell$ and $(a_{k_n})_n$ is a subsequence of $(a_n)_n$. We need to show
that $a_{k_n} \to \ell$.

Let $\varepsilon > 0$. By our hypothesis, there exists an index N such that for all $n \ge N$, we
have $|a_n - \ell| < \varepsilon$. Since $k_n \ge n$ for every n, it follows that $|a_{k_n} - \ell| < \varepsilon$ whenever
$n \ge N$. Consequently, $a_{k_n} \to \ell$.


2.1.4 Lemma A sequence $(a_n)_n$ converges to a number $\ell$ if and only if every
subsequence of it contains a subsequence that is convergent to $\ell$.


Proof The necessity follows from Lemma 2.1.3. The sufficiency part can be argued
by reductio ad absurdum. If $(a_n)_n$ is not convergent to $\ell$, then (by negating
Definition 2.1.1) we infer that there exist $\varepsilon > 0$ and a subsequence $(a_{k_n})_n$ such
that

$$|a_{k_n} - \ell| \ge \varepsilon \quad \text{for all } n \in \mathbb{N}.$$

But in this case $(a_{k_n})_n$ has no subsequence convergent to $\ell$, in contradiction with
our hypothesis. Consequently, $a_n \to \ell$.


2.1.5 Lemma Every convergent sequence is bounded. 


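Proof Suppose that $a_n \to \ell$. According to the definition of convergence, for $\varepsilon = 1$
there exists N such that

$$|a_n - \ell| < 1 \quad \text{for all } n \ge N,$$

equivalently,

$$\ell - 1 < a_n < \ell + 1 \quad \text{for all } n \ge N.$$

Then

$$\inf \{a_0, a_1, \ldots, a_N, \ell - 1\} \le a_n \le \sup \{a_0, a_1, \ldots, a_N, \ell + 1\}$$

for all $n \in \mathbb{N}$, and thus the sequence $(a_n)_n$ is bounded.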


As already noticed, a bounded sequence may not be convergent. However, as we 
show in Theorem 2.2.4, it always admits convergent subsequences. 


2.1.6 Lemma If $(a_n)_n$ converges to $\ell$, then $(|a_n|)_n$ converges to $|\ell|$.


Proof This is an immediate consequence of the following inequality:

$$\big| |a_n| - |\ell| \big| \le |a_n - \ell|.$$


2.1.7 Corollary A sequence $(a_n)_n$ is convergent to 0 if and only if the sequence
$(|a_n|)_n$ of its absolute values converges to 0.


An important consequence of Lemma 2.1.6 is the possibility to reduce the theory of
convergent sequences to the case of positive sequences convergent to zero. In fact,

$$a_n \to \ell \quad \text{if and only if} \quad |a_n - \ell| \to 0.$$

Moreover, in order to establish the convergence $a_n \to \ell$, it suffices to find a
suitable sequence $(b_n)_n$ such that

$$|a_n - \ell| \le b_n \to 0.$$

We illustrate this idea in our discussion on the algebraic operations with convergent
sequences.


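2.1.8 Theorem Let $(a_n)_n$ and $(b_n)_n$ be two convergent sequences such that
$a_n \to \ell$ and $b_n \to \ell'$. Then:

(a) $(a_n + b_n)_n$ converges to $\ell + \ell'$.

(b) $(a_n b_n)_n$ converges to $\ell\ell'$.


Proof (a) Let $\varepsilon > 0$. Then, starting with an index N,

$$|a_n - \ell| < \varepsilon/2 \quad \text{and} \quad |b_n - \ell'| < \varepsilon/2$$

and thus for every $n \ge N$, we get

$$|(a_n + b_n) - (\ell + \ell')| \le |a_n - \ell| + |b_n - \ell'| < \varepsilon/2 + \varepsilon/2 = \varepsilon.$$

(b) According to Lemma 2.1.5, there exists a number $A > 0$ such that $|a_n| \le A$
for all n. Let $\varepsilon > 0$. The convergence of $(a_n)_n$ to $\ell$ and the convergence of $(b_n)_n$ to $\ell'$
give us a number N such that

$$|a_n - \ell| < \frac{\varepsilon}{2|\ell'| + 1} \quad \text{and} \quad |b_n - \ell'| < \frac{\varepsilon}{2A}$$

for every $n \ge N$. Then,

$$|a_n b_n - \ell\ell'| \le |a_n b_n - a_n \ell'| + |a_n \ell' - \ell\ell'| = |a_n| \cdot |b_n - \ell'| + |\ell'| \cdot |a_n - \ell|$$
$$\le A\,|b_n - \ell'| + |\ell'| \cdot |a_n - \ell| < \varepsilon/2 + \varepsilon/2 = \varepsilon$$

for every $n \ge N$, that is, $a_n b_n \to \ell\ell'$.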


2.1.9 Corollary If $(a_n)_n$ converges to $\ell$ and $\alpha$ is a real number, then $\alpha a_n \to \alpha\ell$.


2.1.10 Theorem If $(a_n)_n$ converges to $\ell$ and $\ell \ne 0$, then $a_n \ne 0$ starting with some
integer N and

$$\frac{1}{a_n} \to \frac{1}{\ell}.$$

Proof By Lemma 2.1.6, $|a_n| \to |\ell|$. According to the definition of convergence,
for $\varepsilon = |\ell|/2$ there must exist N such that $\big| |a_n| - |\ell| \big| < |\ell|/2$ for every $n \ge N$.
Consequently, for $n \ge N$ we get

$$|a_n| > \frac{|\ell|}{2}$$

and thus

$$\left| \frac{1}{a_n} - \frac{1}{\ell} \right| = \frac{|a_n - \ell|}{|a_n|\,|\ell|} \le \frac{2}{|\ell|^2}\, |a_n - \ell|,$$

which yields $1/a_n \to 1/\ell$.


From Lemma 2.1.6 and Theorem 2.1.8(a) we can easily infer the order
properties of convergent sequences.


2.1.11 Theorem (Order properties of convergent sequences) 

(a) If $(a_n)_n$ is a sequence of nonnegative numbers, convergent to $\ell$, then $\ell$ is also
a nonnegative number.

(b) If $(a_n)_n$ converges to $\ell$, $(b_n)_n$ converges to $\ell'$ and $a_n \le b_n$ for every $n \in \mathbb{N}$,
then $\ell \le \ell'$.


Notice that even if all terms of a convergent sequence are positive, the limit may
be zero.


2.1.12 The Squeeze Theorem Let $(a_n)_n$, $(b_n)_n$, and $(c_n)_n$ be three sequences such
that

$$a_n \le b_n \le c_n$$

for all n. If $(a_n)_n$ and $(c_n)_n$ are convergent to the same limit $\ell$, then $(b_n)_n$ is also
convergent to $\ell$.


Proof In fact, $|b_n - \ell| \le \sup \{|a_n - \ell|, |c_n - \ell|\}$ for all $n \in \mathbb{N}$.


2.1.13 Corollary The product of a sequence convergent to 0 by a bounded sequence 
is also a sequence convergent to 0. 


We illustrate the Squeeze Theorem by proving that the sequence $(\sqrt[n]{n})_{n \ge 2}$ is
convergent to 1.

In fact, $\sqrt[n]{n} = 1 + r_n$, where $r_n > 0$. We show that $r_n \to 0$. For, note that for all
$n \ge 2$ we have

$$n = (1 + r_n)^n = \binom{n}{0} + \binom{n}{1} r_n + \binom{n}{2} r_n^2 + \cdots + \binom{n}{n} r_n^n > \binom{n}{2} r_n^2 = \frac{n(n-1)}{2}\, r_n^2,$$

so that $0 < r_n^2 < 2/(n - 1)$, that is, $0 < r_n < \sqrt{2/(n - 1)}$. As $(\sqrt{2/(n - 1)})_n$ is
convergent to 0, we can apply Theorem 2.1.12 to conclude that $r_n \to 0$ too.
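
The estimate $0 < r_n < \sqrt{2/(n-1)}$ can be observed numerically. The following sketch uses floating-point arithmetic, so it is only a sanity check for moderate values of n; the function names `r` and `bound` are illustrative.

```python
from math import sqrt

def r(n):
    """r_n = n**(1/n) - 1, for n >= 2."""
    return n ** (1.0 / n) - 1.0

def bound(n):
    """Upper bound sqrt(2/(n-1)) obtained from the binomial estimate."""
    return sqrt(2.0 / (n - 1))

if __name__ == "__main__":
    for n in (2, 10, 100, 1000, 10000):
        print(n, r(n), bound(n))
```

The printed values suggest that the bound is far from sharp, which is harmless here: the Squeeze Theorem only needs some upper bound tending to 0.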


2.1.14 Remark In connection with the notions of boundedness and convergence, we
may consider the following linear spaces:

$F_0(\mathbb{N}, \mathbb{R})$, the space of all sequences of real numbers, having only finitely many
nonzero terms;

$c_0(\mathbb{N}, \mathbb{R})$, the space of all sequences of real numbers convergent to 0;

$c(\mathbb{N}, \mathbb{R})$, the space of all convergent sequences of real numbers;

$\ell^\infty(\mathbb{N}, \mathbb{R})$, the space of all bounded sequences of real numbers.


The linear operations are inherited from the space $F(\mathbb{N}, \mathbb{R})$ of all real-valued
sequences, and are defined by the formulas

$$(a_n)_n + (b_n)_n = (a_n + b_n)_n$$

$$\lambda (a_n)_n = (\lambda a_n)_n.$$

According to Remark 1.2.9 and Lemma 2.1.6, each of these spaces is an example of
linear lattice of functions (defined on N) with respect to the ordering

$$(a_n)_n \le (b_n)_n \text{ if and only if } a_n \le b_n \text{ for all } n.$$

An important role in real analysis on intervals is played by positive sequences,
that is, by sequences $(a_n)_n$ such that $a_n > 0$ for all indices n.


Exercises 


1. Prove that a sequence of integer numbers is convergent if and only if it is eventually
constant (that is, constant from some index onward).
2. Suppose that $(a_n)_n$ is a sequence convergent to $\ell$. Prove that $(a_{\sigma(n)})_n$ is also
convergent to $\ell$, for every injective map $\sigma : \mathbb{N} \to \mathbb{N}$.
3. Compute $\lim_{n \to \infty} \left( \frac{1}{\sqrt{n^2 + 1}} + \frac{1}{\sqrt{n^2 + 2}} + \cdots + \frac{1}{\sqrt{n^2 + n}} \right)$.


4. Prove that if $(a_n)_n$ converges to $\ell$ and $(b_n)_n$ converges to $\ell'$, then
$\max \{a_n, b_n\} \to \max \{\ell, \ell'\}$ and $\min \{a_n, b_n\} \to \min \{\ell, \ell'\}$.


[Hint: See the formulas $\max \{a_n, b_n\} = \frac{1}{2}(a_n + b_n + |a_n - b_n|)$ and
$\min \{a_n, b_n\} = \frac{1}{2}(a_n + b_n - |a_n - b_n|)$.]

5. Let $(a_n)_n$ be a strictly increasing sequence of positive numbers, in arithmetic
progression.


(a) Prove that $\sqrt{a_{2k-1} a_{2k+1}} < a_{2k}$ for all $k \in \mathbb{N}^*$.
(b) Infer from the preceding result that

$$x_n = \frac{a_1}{a_2} \cdot \frac{a_3}{a_4} \cdots \frac{a_{2n-1}}{a_{2n}} < \sqrt{\frac{a_1}{a_{2n+1}}}$$

for all n and conclude that $\lim_{n \to \infty} x_n = 0$.


6. Prove that $\lim_{n \to \infty} \sqrt[n]{c} = 1$ for every $c > 0$.

7. Let $a_1, \ldots, a_p, c_1, \ldots, c_p$ be positive numbers. Prove that

$$\lim_{n \to \infty} \left( c_1 a_1^n + \cdots + c_p a_p^n \right)^{1/n} = \max \{a_1, \ldots, a_p\}.$$


2.2 Monotone Convergence Theorem 


One of the best known criteria for convergence of sequences of real numbers is the 
following: 


2.2.1 Monotone Convergence Theorem Every increasing and bounded above 
sequence of real numbers is convergent to its least upper bound. 


Analogously, every decreasing and bounded below sequence is convergent to its 
greatest lower bound. 


Proof Let $(a_n)_n$ be a bounded above and increasing sequence, and let $\ell$ be its least
upper bound. We show that $(a_n)_n$ is convergent to $\ell$.

Let $\varepsilon > 0$. By the definition of least upper bound we infer the existence of an
index N such that $\ell - \varepsilon < a_N \le \ell$. Since the sequence is increasing, it follows that

$$\ell - \varepsilon < a_N \le a_n \le \ell < \ell + \varepsilon$$

for all $n \ge N$, and the proof is done.


2.2.2 Examples (a) The convergence to 0 of the sequence $(1/n^\alpha)_{n \ge 1}$ (for $\alpha > 0$) can be
seen as a consequence of the Monotone Convergence Theorem. In fact, this sequence
is decreasing and its infimum is 0.

(b) Let $a \in \mathbb{R}$. The sequence $(a^n)_{n \ge 1}$ is convergent if $a \in (-1, 1]$ and divergent
if $a \in (-\infty, -1] \cup (1, \infty)$. Moreover,

$$\lim_{n \to \infty} a^n = \begin{cases} 0 & \text{if } |a| < 1 \\ 1 & \text{if } a = 1. \end{cases}$$

The case $a = 1$ is trivial. If $|a| < 1$, then the sequence of general term $a_n = |a|^n$
is decreasing and bounded below by 0. Therefore, by the Monotone Convergence
Theorem it is convergent. Let $\ell$ be its limit. Since

$$a_{n+1} = |a| \cdot a_n \quad \text{for all } n$$

we get, by taking the limit on both sides, that $\ell = |a| \cdot \ell$, from which it follows
that $\ell = 0$. An application of Corollary 2.1.7 concludes the proof in the case where
$a \in (-1, 1)$.

When $a \in (-\infty, -1] \cup (1, \infty)$, we have $|a^{n+1} - a^n| = |a|^n\, |a - 1| \ge |a - 1|$
for all n. Consecutive terms thus stay at a distance of at least $|a - 1| > 0$ from each
other, and in this case the sequence $(a^n)_n$ is divergent.

(c) The summation of positive sequences is also an illustration of the Monotone
Convergence Theorem. If $(a_n)_n$ is such a sequence, we attach to it the sequence of
partial sums

$$S_0 = a_0,\quad S_1 = a_0 + a_1,\quad S_2 = a_0 + a_1 + a_2,\ \ldots$$

which is increasing. We say that $(a_n)_n$ is summable, with sum S, if the sequence of
its partial sums converges to S. Usually, this is denoted by

$$a_0 + a_1 + a_2 + \cdots = S \quad \text{or} \quad \sum_{n=0}^{\infty} a_n = S.$$

For example, if $0 < q < 1$, then

$$1 + q + q^2 + \cdots = \frac{1}{1 - q}.$$

(d) The decimal representation of a real number is another illustration of this
topic. Indeed, the fact that

$$a = d_n d_{n-1} \cdots d_0 . d_{-1} d_{-2} d_{-3} \cdots = \sup_{k \ge 0} \left( d_n \cdot 10^n + \cdots + d_{n-k} \cdot 10^{n-k} \right)$$

is equivalent to

$$a = \lim_{k \to \infty} \left( d_n \cdot 10^n + \cdots + d_{n-k} \cdot 10^{n-k} \right) = \sum_{k=0}^{\infty} d_{n-k} \cdot 10^{n-k}.$$


We will come back to the problem of summing numerical sequences in Chap. 4. 

As already noticed, every convergent sequence is bounded and there exist bounded 
sequences that are not convergent. However, every bounded sequence has a conver- 
gent subsequence. 


2.2.3 Lemma Every sequence has a monotone subsequence.


Proof Let us consider the set $A = \{i : a_i \le a_j \text{ for all } j > i\}$.

There are two possibilities:

Case 1. A is an infinite set of elements $i_0 < i_1 < i_2 < \cdots$. Then, according to
the definition of A, $a_{i_0} \le a_{i_1} \le a_{i_2} \le \cdots$.

Case 2. A is a finite set. Put $i_0 = \sup A + 1$ if A is nonempty and $i_0 = 0$ if A is
empty. Since $i_0 \notin A$, there exists $i_1 > i_0$ such that $a_{i_0} > a_{i_1}$. Also, $i_1 \notin A$, so that
there exists $i_2 > i_1$ such that $a_{i_1} > a_{i_2}$, and so on.


By the Monotone Convergence Theorem and Lemma 2.2.3 we infer:

2.2.4 Bolzano-Weierstrass Theorem Every bounded sequence has a convergent
subsequence.


2.2.5 Corollary If a bounded sequence does not converge to a number $\ell$, then a
subsequence of it converges to a limit which is different from $\ell$.


Proof This follows from Lemma 2.1.4 and the Bolzano-Weierstrass Theorem.


Exercises 


1. Consider the sequence $(a_n)_{n \ge 1}$ defined by

$$a_n = \underbrace{\sqrt{a + \sqrt{a + \cdots + \sqrt{a}}}}_{n \text{ roots}},$$

where $a > 0$ is a parameter. Prove that the sequence is convergent and indicate
its limit.
[Hint: Notice that $a_1 < a_2$ and $a_{n+1}^2 = a + a_n$ for $n \ge 1$.]
2. (The Babylonian Algorithm for computing square roots).
(a) Prove that for every positive numbers $a_0$ and a, the sequence $(a_n)_n$ defined
by

$$a_{n+1} = \frac{1}{2} \left( a_n + \frac{a}{a_n} \right) \quad \text{for } n \ge 0$$

is convergent to $\sqrt{a}$.
(b) Consider the particular case where $a_0 = 3$ and $a = 8$. Show that
$a_{n+1}^2 - 8 < (a_n^2 - 8)^2/32$, and thus
0< az =a 


for all n > 1. Then infer that 0 < a, — V8 < 10-10-32", for all n > 1. Since 
10 - 10732" < lige 2). this shows that a, provides an approximation 
of ./8 with at least 3 - 2”—~2 — 2 exact decimals, whenever n > 2. 
(c) Using a computer, one can easily see that the upper bound of $a_n - \sqrt{8}$ indicated
at point (b) is pretty rough. Find a better upper bound.

3. (The continued fraction expansion of the golden ratio). Consider the sequence

$$a_0 = 1,\quad a_1 = 1 + \frac{1}{1},\quad a_2 = 1 + \cfrac{1}{1 + \cfrac{1}{1}},\ \ldots$$

whose terms are related by the formula $a_{n+1} = 1 + \frac{1}{a_n}$ for $n \ge 0$.
(a) Prove that both subsequences $(a_{2n})_n$ and $(a_{2n+1})_n$ are monotone and bounded.

(b) Infer that $(a_n)_n$ is convergent to the golden ratio $\varphi = \frac{1 + \sqrt{5}}{2}$.

In connection with Exercise 3, note that $\varphi = \lim_{n \to \infty} \frac{F_{n+1}}{F_n}$, where $(F_n)_n$ is the
sequence of Fibonacci numbers.
The sequences associated to a function $F : X \to X$ via the formula

$$x_0 = a, \quad x_{n+1} = F(x_n) \quad \text{for } n \ge 0$$

are usually called recurrent sequences. If X is an interval and F is an increasing
function, then the associated recurrent sequences are monotone, while if F
is decreasing, the subsequences $(x_{2n})_n$ and $(x_{2n+1})_n$ are monotone (of opposite
monotonicity). See Appendix A for additional information on recurrent
sequences.
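
Recurrent sequences are straightforward to iterate on a computer. The sketch below assumes nothing beyond the formula $x_{n+1} = F(x_n)$; the names `iterate` and `golden_step` are illustrative. Applied to the decreasing map $F(x) = 1 + 1/x$ of Exercise 3 with exact rationals, it exhibits the two monotone subsequences of opposite monotonicity.

```python
from fractions import Fraction

def iterate(F, x0, steps):
    """Return [x_0, x_1, ..., x_steps] for the recurrent sequence x_{n+1} = F(x_n)."""
    xs = [x0]
    for _ in range(steps):
        xs.append(F(xs[-1]))
    return xs

def golden_step(x):
    """F(x) = 1 + 1/x, the map behind Exercise 3."""
    return 1 + 1 / x

if __name__ == "__main__":
    xs = iterate(golden_step, Fraction(1), 10)
    print("even-indexed terms:", [float(x) for x in xs[0::2]])
    print("odd-indexed terms: ", [float(x) for x in xs[1::2]])
```

The terms produced are ratios of consecutive Fibonacci numbers, in agreement with the remark following Exercise 3.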

The next exercise is an example of recurrent sequence in $\mathbb{R}^2$, presented with the
means of one real variable analysis:

4. (Gauss' Arithmetic-Geometric Mean). Let a and b be two numbers such that
$a > b > 0$. Prove that the sequences $(x_n)_n$ and $(y_n)_n$ defined by

$$x_0 = a, \quad y_0 = b,$$

$$x_{n+1} = \frac{x_n + y_n}{2}, \quad y_{n+1} = \sqrt{x_n y_n} \quad \text{for } n \ge 0$$

are convergent and they have a common limit $M(a, b)$. Follow the next steps:
(a) $x_n > y_n$ for all n.

(b) The sequence $(x_n)_n$ is decreasing, while the sequence $(y_n)_n$ is increasing.
Conclude from here that both are convergent.

(c) Take the limit of the recursive relation $x_{n+1} = (x_n + y_n)/2$ to obtain that the
two sequences $(x_n)_n$ and $(y_n)_n$ have the same limit.

An application to integral calculus of Gauss' Arithmetic-Geometric Mean is
the objective of Exercise 13, Sect. 9.2.
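
The arithmetic-geometric mean iteration converges very quickly, which a short floating-point experiment makes visible. This is a hedged sketch (the function name `agm_steps` is ours); rounding errors mean that the monotonicity of steps (a) and (b) is only observed up to machine precision.

```python
from math import sqrt

def agm_steps(a, b, steps):
    """Return the pairs (x_n, y_n), n = 0, ..., steps, of the iteration of Exercise 4."""
    x, y = float(a), float(b)
    pairs = [(x, y)]
    for _ in range(steps):
        # Both new values are computed from the old pair (x_n, y_n).
        x, y = (x + y) / 2, sqrt(x * y)
        pairs.append((x, y))
    return pairs

if __name__ == "__main__":
    for n, (x, y) in enumerate(agm_steps(2, 1, 5)):
        print(n, x, y, x - y)
```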

5. Prove that every real number is the limit of a strictly increasing sequence of
rational numbers and of a strictly decreasing sequence of rational numbers;
a similar statement is true when rationals are replaced by irrationals.

6. Infer the Bolzano-Weierstrass Theorem from the Nested Intervals Lemma.
[Hint: If $(a_n)_n$ is included in [a, b] and $c = (a + b)/2$, then at least one of the
subintervals [a, c] and [c, b] contains a subsequence $(a_n^{(1)})_n$ of $(a_n)_n$. Then iterate
the argument and conclude that $(a_n^{(n)})_n$ is convergent.]
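
Returning to Exercise 2, the Babylonian iteration can be run in exact rational arithmetic, which makes the squaring of the error visible. The sketch below is only an experiment for the particular case $a_0 = 3$, $a = 8$ of point (b); the function name `babylonian` is illustrative.

```python
from fractions import Fraction

def babylonian(a, a0, steps):
    """Return [a_0, ..., a_steps] with a_{n+1} = (a_n + a/a_n)/2, as exact rationals."""
    xs = [Fraction(a0)]
    for _ in range(steps):
        xs.append((xs[-1] + Fraction(a) / xs[-1]) / 2)
    return xs

if __name__ == "__main__":
    for n, x in enumerate(babylonian(8, 3, 5)):
        # x*x - 8 measures how far a_n is from sqrt(8); it is roughly squared at each step.
        print(n, float(x), float(x * x - 8))
```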


2.3 The Number e 


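Consider the sequences

$$a_n = \left( 1 + \frac{1}{n} \right)^n \quad \text{and} \quad b_n = 1 + \frac{1}{1!} + \cdots + \frac{1}{n!}$$

indexed over $n \ge 1$. We show that the two sequences above have a common limit,
usually denoted e.

2.3.1 Lemma The sequence $(a_n)_{n \ge 1}$ is strictly increasing.


Proof According to the AM-GM inequality,

$$\sqrt[n+1]{\left( 1 + \frac{1}{n} \right)^n \cdot 1} = \sqrt[n+1]{\left( 1 + \frac{1}{n} \right) \cdots \left( 1 + \frac{1}{n} \right) \cdot 1} < \frac{1 + n \left( 1 + \frac{1}{n} \right)}{n + 1} = 1 + \frac{1}{n + 1},$$

whence $a_n < a_{n+1}$.


2.3.2 Lemma We have $a_n < b_n < 3$ for every $n \ge 2$.


Proof In fact,

$$a_n = \left( 1 + \frac{1}{n} \right)^n = \binom{n}{0} + \binom{n}{1} \frac{1}{n} + \binom{n}{2} \frac{1}{n^2} + \cdots + \binom{n}{n} \frac{1}{n^n}$$
$$= 1 + \frac{n}{1!} \cdot \frac{1}{n} + \frac{n(n-1)}{2!} \cdot \frac{1}{n^2} + \cdots + \frac{n(n-1)(n-2) \cdots 2 \cdot 1}{n!} \cdot \frac{1}{n^n}$$
$$< 1 + \frac{1}{1!} + \cdots + \frac{1}{n!} = b_n.$$

On the other hand,

$$b_n = 1 + \frac{1}{1!} + \cdots + \frac{1}{n!} \le 2 + \frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \cdots + \frac{1}{(n-1)\, n}$$
$$= 2 + \left( 1 - \frac{1}{2} \right) + \left( \frac{1}{2} - \frac{1}{3} \right) + \cdots + \left( \frac{1}{n-1} - \frac{1}{n} \right) = 3 - \frac{1}{n} < 3.$$


Lemmas 2.3.1 and 2.3.2 show that the sequence $\left( \left( 1 + \frac{1}{n} \right)^n \right)_n$ verifies the hypotheses of the
Monotone Convergence Theorem and thus it is convergent.


2.3.3 Definition The number e is defined as the limit of the sequence $\left( \left( 1 + \frac{1}{n} \right)^n \right)_n$.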
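
Lemmas 2.3.1 and 2.3.2 can be checked for small indices in exact arithmetic. The following sketch (the function names `a` and `b` mirror the sequences $a_n$ and $b_n$ but are otherwise our choice) evaluates both sequences as rational numbers; it illustrates how slowly $a_n$ grows compared with $b_n$.

```python
from fractions import Fraction
from math import factorial

def a(n):
    """a_n = (1 + 1/n)**n as an exact rational number."""
    return (1 + Fraction(1, n)) ** n

def b(n):
    """b_n = 1 + 1/1! + ... + 1/n! as an exact rational number."""
    return sum(Fraction(1, factorial(k)) for k in range(n + 1))

if __name__ == "__main__":
    for n in (2, 5, 10, 20):
        print(n, float(a(n)), float(b(n)))
```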

The number e was introduced by Jacob Bernoulli in connection with a question on 
compound interest. The notation e is due to Leonhard Euler, who actually revealed 
its importance in mathematics. 

The number e is used as the base of natural logarithms. 

We show that e is also the limit of the sequence $(b_n)_n$ above.

In fact, this sequence is strictly increasing and bounded above, so it also obeys
the Monotone Convergence Theorem. Let $\ell$ be its limit. By Lemma 2.3.2, we infer
that $\ell \ge e$. For $N \in \mathbb{N}$ arbitrarily fixed and $n \ge N$, we have

$$a_n = 1 + \frac{1}{1!} + \frac{1}{2!} \left( 1 - \frac{1}{n} \right) + \cdots + \frac{1}{n!} \left( 1 - \frac{1}{n} \right) \cdots \left( 1 - \frac{n-1}{n} \right)$$
$$\ge 1 + \frac{1}{1!} + \frac{1}{2!} \left( 1 - \frac{1}{n} \right) + \cdots + \frac{1}{N!} \left( 1 - \frac{1}{n} \right) \cdots \left( 1 - \frac{N-1}{n} \right),$$

so that, passing to the limit as $n \to \infty$, we get that $e \ge b_N$. Since N was arbitrarily
fixed, this yields also $e \ge \ell$. Consequently, $\ell = e$.


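2.3.4 Lemma The inequality

\[ \left(1 + \frac{1}{n}\right)^n < e < \left(1 + \frac{1}{n}\right)^{n+1} \]

holds for every $n \ge 1$.


Proof The left-hand inequality follows from Lemma 2.3.1, since a strictly increasing
sequence stays strictly below its limit. By the AM-GM inequality we infer easily that
the inverse of the sequence $\left(\left(1 + \frac{1}{n}\right)^{n+1}\right)_n$ is strictly increasing (and thus the
sequence itself is strictly decreasing). Therefore,

\[ \left(1 + \frac{1}{n}\right)^{n+1} > \lim_{k \to \infty} \left(1 + \frac{1}{k}\right)^{k+1} = e. \]


It was noticed by I. Schur that the sequence $a_n(\alpha) = \left(1 + \frac{1}{n}\right)^{n+\alpha}$ is decreasing
if $\alpha \in [\frac{1}{2}, \infty)$, and increasing for $n \ge N(\alpha)$ if $\alpha \in (-\infty, 1/2)$. See Exercise 5,
Sect. 8.2.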


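2.3.5 Remark The sequence $(b_n)_n$ approaches $e$ quite fast. In fact,

\[ 0 < e - \left(1 + \frac{1}{1!} + \cdots + \frac{1}{n!}\right) < \frac{1}{n! \, n} \tag{2.1} \]

for every $n \ge 1$. This can be argued as follows:

\[ e - b_n = \lim_{m \to \infty} \left(\frac{1}{(n+1)!} + \frac{1}{(n+2)!} + \cdots + \frac{1}{(n+m)!}\right) \]

\[ = \frac{1}{(n+1)!} \lim_{m \to \infty} \left(1 + \frac{1}{n+2} + \frac{1}{(n+2)(n+3)} + \cdots + \frac{1}{(n+2) \cdots (n+m)}\right) \]

\[ < \frac{1}{(n+1)!} \left(1 + \frac{1}{n+2} + \frac{1}{(n+2)^2} + \cdots\right) = \frac{1}{(n+1)!} \cdot \frac{n+2}{n+1} \]

\[ = \frac{1}{n! \, n} \cdot \frac{n(n+2)}{(n+1)^2} < \frac{1}{n! \, n}. \]


From the inequality (2.1) it follows that $e$ is an irrational number. In fact, assuming
that $e = p/q$, with $p, q \in \mathbb{N}^*$, we are led to

\[ 0 < \frac{p}{q} - \left(1 + \frac{1}{1!} + \cdots + \frac{1}{q!}\right) < \frac{1}{q! \, q}. \]

By multiplying both sides by $q!$, we infer that there would exist an integer between
0 and $1/q$, which is false. The number $e$ is transcendental. Rounded to ten
decimals, $e \approx 2.7182818285$.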
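The following Python sketch is only a numerical illustration of Lemmas 2.3.1 and 2.3.2 and of inequality (2.1), not part of the argument. The helper names `a` and `b` are chosen here for convenience; the sums $b_n$ are computed exactly with rational arithmetic, while the comparison with $e$ relies on the floating-point constant `math.e`.

```python
from fractions import Fraction
from math import e, factorial


def a(n: int) -> float:
    """a_n = (1 + 1/n)^n, in floating point."""
    return (1 + 1 / n) ** n


def b(n: int) -> Fraction:
    """b_n = 1 + 1/1! + ... + 1/n!, computed exactly."""
    return sum(Fraction(1, factorial(k)) for k in range(n + 1))


for n in (1, 2, 5, 10):
    gap = e - float(b(n))            # e - b_n, up to rounding
    bound = 1 / (factorial(n) * n)   # right-hand side of (2.1)
    print(n, a(n), float(b(n)), gap, bound)
```

For small $n$ the printed gap $e - b_n$ stays below $1/(n!\,n)$, while $a_n$ lags far behind $b_n$; this is consistent with the remark that $(b_n)_n$ approaches $e$ quite fast.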


Exercises 


1. An immediate consequence of Lemma 2.3.4 is that

\[ \frac{1}{n+1} < \log(n+1) - \log n < \frac{1}{n}. \]

Use this fact to prove that the sequence $c_n = 1 + \frac{1}{2} + \cdots + \frac{1}{n} - \log n$ is decreasing
and bounded (and thus it is convergent).


Remark The limit of this sequence, usually denoted $\gamma$, is known as Euler's constant;
we have $\gamma = 0.57721\ldots$. The problem whether $\gamma$ is rational or irrational
is still open!

2. (a) Infer from the previous exercise that

\[ \lim_{n \to \infty} \left(\frac{1}{n+1} + \cdots + \frac{1}{2n}\right) = \log 2. \]

(b) Use induction (or a summation trick) to prove the Botez-Catalan identity,

\[ 1 - \frac{1}{2} + \frac{1}{3} - \cdots + \frac{1}{2n-1} - \frac{1}{2n} = \frac{1}{n+1} + \cdots + \frac{1}{2n}, \]

and conclude that

\[ \lim_{n \to \infty} \left(1 - \frac{1}{2} + \frac{1}{3} - \cdots + \frac{(-1)^{n-1}}{n}\right) = \log 2. \]


3. It is easy to see that the sequence 


1 1 n 
Xn = aaa a 


is convergent to $e$. Use some values of $n$ to see that this sequence gives far
better approximations of $e$ than both the sequences

\[ a_n = \left(1 + \frac{1}{n}\right)^n \quad\text{and}\quad b_n = \left(1 + \frac{1}{n}\right)^{n+1}. \]

What is the reason behind this fact?


2.4 The Cauchy Completeness of $\mathbb{R}$


The Bolzano-Weierstrass Theorem yields a very important theoretical criterion of 
convergence, asserting the equivalence between the property of the terms of being 
close to a certain point and that of being close to each other. 


2.4.1 Definition A sequence $(a_n)_n$ of real numbers is called a Cauchy sequence (or
a fundamental sequence) if for every $\varepsilon > 0$, there exists an index $N$ such that for
all $m, n \ge N$, we have $|a_m - a_n| < \varepsilon$.


The following three results are immediate: 
2.4.2 Lemma Every convergent sequence is a Cauchy sequence. 
2.4.3 Lemma Every Cauchy sequence is bounded. 


2.4.4 Lemma A Cauchy sequence that contains a convergent subsequence is con- 
vergent. 


Proof Let $(a_n)_n$ be a Cauchy sequence with $a_{k_n} \to \ell$. We show that $a_n \to \ell$.

Let $\varepsilon > 0$. Since $(a_n)_n$ is a Cauchy sequence, there exists an index $N_1$ such
that for all $m, n \ge N_1$, we have $|a_m - a_n| < \varepsilon/2$. Since $a_{k_n} \to \ell$, there exists
an index $N_2$ such that for all $n \ge N_2$, we have $|a_{k_n} - \ell| < \varepsilon/2$. Then, for all
$n \ge N = \max\{N_1, N_2\}$, we have $k_n \ge n \ge N$ and we get

\[ |a_n - \ell| \le |a_n - a_{k_n}| + |a_{k_n} - \ell| < \varepsilon/2 + \varepsilon/2 = \varepsilon, \]

which ends the proof.


The three lemmas above can be combined to yield an important theoretical crite- 
rion of convergence. 


2.4.5 Theorem (Cauchy's Criterion) A sequence $(a_n)_n$ of real numbers is convergent
if and only if it is a Cauchy sequence.


Proof If $(a_n)_n$ is a Cauchy sequence, then it is bounded (see Lemma 2.4.3) and
so, by the Bolzano-Weierstrass Theorem, it has a convergent subsequence $(a_{k_n})_n$. By
Lemma 2.4.4, the whole sequence $(a_n)_n$ must be convergent.

The converse is obvious and was stated as Lemma 2.4.2 above.


Exercises 


1. Prove the convergence of the sequence given by the formula

\[ a_n = \frac{N_1}{10} + \frac{N_2}{10^2} + \cdots + \frac{N_n}{10^n}, \]

where $N_1, N_2, N_3, \ldots$ are arbitrary numbers in $\{0, 1, \ldots, 9\}$. Notice the connection
of this fact with the decimal representation of real numbers.
2. Prove that the sequence of general term


is a Cauchy sequence. 


2.5 The Extended Real Number System 


We add to $\mathbb{R}$ two symbols, $-\infty$ (minus infinity) and $\infty$ (plus infinity), and extend
the original order in $\mathbb{R}$ as follows:

\[ -\infty < x \ \text{ for all } x \in \mathbb{R}, \qquad x < \infty \ \text{ for all } x \in \mathbb{R}, \qquad -\infty < \infty. \]

The set $\overline{\mathbb{R}} = \mathbb{R} \cup \{-\infty, \infty\}$ together with the above order relation is called the
extended real number system. The main feature of $\overline{\mathbb{R}}$ is that every nonempty subset
$A$ of $\mathbb{R}$ admits simultaneously a least upper bound and a greatest lower bound (in $\overline{\mathbb{R}}$).
If $A$ is a subset of $\mathbb{R}$ bounded above, then the existence of $\sup A$ follows from the
Completeness Axiom, while for those subsets that are not bounded above, necessarily
$\sup A = \infty$. Thus, $\sup A = \infty$ is an alternative way to express that a subset
$A$ of $\mathbb{R}$ is not bounded above in $\mathbb{R}$. In a similar manner, one can discuss the case of
subsets bounded below.

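It is useful to extend the algebraic structure of $\mathbb{R}$, by defining some operations
with infinite elements. More precisely, the addition is supplemented by

\[ x + (-\infty) = (-\infty) + x = -\infty \quad \text{for all } x \in \mathbb{R}, \]

\[ x + \infty = \infty + x = \infty \quad \text{for all } x \in \mathbb{R}, \]

\[ (-\infty) + (-\infty) = -\infty \quad\text{and}\quad \infty + \infty = \infty, \]

while multiplication is supplemented by

\[ x \cdot \infty = \infty \cdot x = \begin{cases} \infty & \text{if } x \in \overline{\mathbb{R}},\ x > 0 \\ -\infty & \text{if } x \in \overline{\mathbb{R}},\ x < 0 \end{cases} \qquad x \cdot (-\infty) = (-\infty) \cdot x = \begin{cases} -\infty & \text{if } x \in \overline{\mathbb{R}},\ x > 0 \\ \infty & \text{if } x \in \overline{\mathbb{R}},\ x < 0. \end{cases} \]

All these operations are motivated by their companions in terms of limits. See
Exercise 2.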

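We do not define

\[ \infty - \infty, \quad (-\infty) + \infty, \quad 0 \cdot (-\infty), \quad (-\infty) \cdot 0, \quad 0 \cdot \infty, \quad \infty \cdot 0 \]

nor

\[ 1^{\infty}, \quad \infty^{0} \quad\text{and}\quad 0^{0}. \]

These expressions are known as "indeterminates". See Exercise 7 for an explanation.
In Chap. 11, devoted to the Lebesgue integral, the operations $0 \cdot \infty$, $\infty \cdot 0$,
$0 \cdot (-\infty)$ and $(-\infty) \cdot 0$ will be considered legitimate (and equal to 0), under certain
special circumstances.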

Following the model of bounded intervals (described in Sect. 1.2), one can consider
intervals of the form

\[ [a, b], \quad (a, b], \quad [a, b) \quad\text{and}\quad (a, b), \]

with $a, b \in \overline{\mathbb{R}}$, $a \le b$. For example,

\[ [-\infty, b) = \{x \in \overline{\mathbb{R}} : -\infty \le x < b\}, \quad (-\infty, \infty) = \mathbb{R} \quad\text{and}\quad [-\infty, \infty] = \overline{\mathbb{R}}. \]

An interval is said to be nondegenerate if it is nonempty and does not reduce to
a single point. The intervals included in $\mathbb{R}$ are called real intervals.
A subset $I$ of $\mathbb{R}$ is called convex if

\[ x, y \in I \ \text{ and } \ t \in [0, 1] \quad\text{implies}\quad (1 - t)x + ty \in I. \]

A combination of the form $(1 - t)x + ty$ with $t \in [0, 1]$ is called a convex combination
of $x$ and $y$. A convex set contains, together with any pair of points, all their convex
combinations.


2.5.1 Proposition The real intervals and the convex subsets of R are the same. 


Proof Clearly, every real interval is a convex set. Conversely, let $I$ be a nonempty
convex subset of $\mathbb{R}$ and put $a = \inf I$ and $b = \sup I$. If $a = b$, then $I = \{a\} = [a, a]$
and the proof is done. If $-\infty < a < b < \infty$, then for every $\varepsilon \in (0, \frac{b-a}{2})$, there are
points $a_\varepsilon \in [a, a + \varepsilon) \cap I$ and $b_\varepsilon \in (b - \varepsilon, b] \cap I$, which implies

\[ [a_\varepsilon, b_\varepsilon] \subset I, \]

due to the convexity of $I$. As a consequence, $I$ must contain

\[ \bigcup_{\varepsilon \in (0, \frac{b-a}{2})} [a_\varepsilon, b_\varepsilon] \supseteq (a, b) \]

and thus $I$ is one of the intervals $(a, b)$, $[a, b)$, $(a, b]$ and $[a, b]$. The other cases,
when $a$ and/or $b$ belong to $\{-\infty, \infty\}$, can be treated similarly.

In Sect. 2.1 we introduced the notion of convergent sequence by using the absolute
value function to measure how close the terms $a_n$ are to the limit $\ell$. However, this
function can be avoided by noticing that the concentration of the terms near the limit
admits an equivalent formulation based on some special intervals around the limit,
the so-called $\varepsilon$-neighborhoods of $\ell$:

\[ I_\varepsilon(\ell) = (\ell - \varepsilon, \ell + \varepsilon). \]


2.5.2 Proposition $a_n \to \ell$ in $\mathbb{R}$ if and only if for every $\varepsilon > 0$, there is a natural
number $N$ such that for all $n \ge N$,

\[ a_n \in I_\varepsilon(\ell). \]

In the case of infinite elements, it is natural to define the $\varepsilon$-neighborhoods as

\[ I_\varepsilon(-\infty) = [-\infty, -\varepsilon) \quad\text{and}\quad I_\varepsilon(\infty) = (\varepsilon, \infty] \]

and to take the condition in Proposition 2.5.2 as the definition for sequences with
infinite limits. This leads to the following definitions.

A sequence $(a_n)_n$ of real numbers has the limit $-\infty$ (equivalently, $a_n \to -\infty$ or
$\lim_{n \to \infty} a_n = -\infty$) if for every $\varepsilon > 0$, there is an index $N$ such that $a_n < -\varepsilon$ for all
$n \ge N$.

A sequence $(a_n)_n$ of real numbers has the limit $\infty$ (equivalently, $a_n \to \infty$ or
$\lim_{n \to \infty} a_n = \infty$) if for every $\varepsilon > 0$, there is an index $N$ such that $a_n > \varepsilon$ for all $n \ge N$.

It is worth noticing that, unlike the case of convergent sequences, it is the large
values of $\varepsilon$ that are of interest here.

In this way, every decreasing sequence of real numbers, that is not bounded below, 
has the limit —oo, and every increasing sequence that is not bounded above, has the 
limit oo. In particular, the sequence of natural numbers has the limit oo. 

Taking into account the Monotone Convergence Theorem, we infer that all 
monotone sequences of real numbers have limit (finite or infinite). Thus, Bolzano- 
Weierstrass Theorem can be extended in the following form: Every sequence of real 
numbers contains a subsequence with limit (finite or infinite). 


Exercises 


1. Prove that:
(a) $\lim_{n \to \infty} \log n = \infty$; and (b) $\lim_{n \to \infty} a^n = \infty$ for all $a > 1$.

2. (The extension of algebraic operations with sequences). Let $(a_n)_n$ and $(b_n)_n$ be
two sequences of real numbers such that $a_n \to \ell$ and $b_n \to \ell'$ in $\overline{\mathbb{R}}$. Prove that
$a_n + b_n \to \ell + \ell'$ and $a_n b_n \to \ell \ell'$ as long as the operations with $\ell$ and $\ell'$ make
sense.

3. Suppose that $(a_n)_n$ is a sequence of real numbers bounded below and $(b_n)_n$ is
a sequence of real numbers with limit $\infty$. Prove that $(a_n + b_n)_n$ also has the
limit $\infty$.

4. Let $(a_n)_n$ be a sequence with infinite limit. Prove that:
(a) there exists a natural number $N$ such that $a_n \ne 0$ for all $n \ge N$.
(b) $\lim_{n \to \infty} \frac{1}{a_n} = 0$.

5. Suppose that $(a_n)_n$ is a sequence of positive numbers convergent to 0. Prove that
$\lim_{n \to \infty} \frac{1}{a_n} = \infty$. What happens if the hypothesis of positivity is dropped?

6. Let $P(x) = a_0 x^N + a_1 x^{N-1} + \cdots + a_N$ be a polynomial function with real
coefficients, of degree $N \ge 1$. Prove that

\[ \lim_{n \to \infty} P(n) = (\operatorname{sgn} a_0) \cdot \infty. \]

7. (a) (An explanation for not defining $\frac{\infty}{\infty}$). Prove, with examples, that for every
$\ell \in [0, \infty]$, there exist pairs of sequences $(a_n)_n$ and $(b_n)_n$ with limit $\infty$ such
that $\frac{a_n}{b_n} \to \ell$.
(b) Explain why $\infty - \infty$ and $1^{\infty}$ are not defined.

8. Let $(a_n)_n$ be a sequence of positive numbers such that $\lim_{n \to \infty} \frac{a_{n+1}}{a_n} = \ell$ exists.
Prove that:
(a) if $\ell < 1$, then $\lim_{n \to \infty} a_n = 0$;
(b) if $\ell > 1$, then $\lim_{n \to \infty} a_n = \infty$;
(c) the case $\ell = 1$ is not conclusive, due to the existence of sequences without
limit. Give an example.

9. (Comparing the order of convergence). Prove that

\[ \lim_{n \to \infty} \frac{P(n)}{a^n} = 0 \]

for all polynomial functions $P$ with real coefficients and all numbers $a > 1$.
[Hint: Use the above exercise.]

10. Suppose that a sequence of real numbers is divergent. Prove that either it has a
limit belonging to $\overline{\mathbb{R}}$ or it contains two subsequences with different limits in $\overline{\mathbb{R}}$.
[Hint: Negate the property of being a Cauchy sequence and then apply the
extension of the Bolzano-Weierstrass Theorem to $\overline{\mathbb{R}}$.]


2.6 Limit Inferior and Limit Superior of a Sequence 


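The limit inferior and limit superior of a sequence $(a_n)_n$ of real numbers are defined
(as elements of $\overline{\mathbb{R}}$), respectively, by the formulas

\[ \liminf_{n \to \infty} a_n = \lim_{n \to \infty} \left(\inf_{k \ge n} a_k\right), \]

\[ \limsup_{n \to \infty} a_n = \lim_{n \to \infty} \left(\sup_{k \ge n} a_k\right). \]

Since the sequence $(\inf_{k \ge n} a_k)_n$ is increasing and the sequence $(\sup_{k \ge n} a_k)_n$ is
decreasing,

\[ \liminf_{n \to \infty} a_n = \sup_{n \ge 0} \left(\inf_{k \ge n} a_k\right) \quad\text{and}\quad \limsup_{n \to \infty} a_n = \inf_{n \ge 0} \left(\sup_{k \ge n} a_k\right). \]

Clearly,

\[ -\infty \le \inf_{n \ge 0} a_n \le \liminf_{n \to \infty} a_n \le \limsup_{n \to \infty} a_n \le \sup_{n \ge 0} a_n \le \infty. \]

For the sequence of general term $a_n = \frac{1}{n}$, we have

\[ \liminf_{n \to \infty} a_n = \sup_{n \ge 1} \left(\inf_{k \ge n} \frac{1}{k}\right) = \sup_{n \ge 1} 0 = 0 \]

and

\[ \limsup_{n \to \infty} a_n = \inf_{n \ge 1} \left(\sup_{k \ge n} \frac{1}{k}\right) = \inf_{n \ge 1} \frac{1}{n} = 0. \]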

In general, whenever the ordinary limit exists, the limit inferior and limit superior 
are both equal to it. Therefore, each can be considered a generalization of the ordinary 
limit. 

2.6.1 Theorem Let $(a_n)_n$ be a sequence of real numbers.

(a) If the sequence $(a_n)_n$ is convergent to a number $\ell$, then

\[ \liminf_{n \to \infty} a_n = \limsup_{n \to \infty} a_n = \ell. \]

(b) If there is a real number $\ell$ such that the equality above holds, then the sequence
$(a_n)_n$ is convergent to $\ell$.


Proof (a) Suppose that $a_n \to \ell$ in $\mathbb{R}$. Then, for any $\varepsilon > 0$, there is a natural number
$N$ such that for all $n \ge N$, we have $\ell - \varepsilon < a_n < \ell + \varepsilon$. Thus,

\[ \ell - \varepsilon \le \inf_{k \ge n} a_k \le \sup_{k \ge n} a_k \le \ell + \varepsilon \]

for all $n \ge N$, from where, by taking the limit, we obtain that

\[ \ell - \varepsilon \le \liminf_{n \to \infty} a_n \le \limsup_{n \to \infty} a_n \le \ell + \varepsilon. \]

Since $\varepsilon > 0$ was arbitrarily fixed, this implies

\[ \liminf_{n \to \infty} a_n = \limsup_{n \to \infty} a_n = \ell. \]

(b) Suppose that $\liminf_{n \to \infty} a_n = \limsup_{n \to \infty} a_n = \ell \in \mathbb{R}$. Clearly,

\[ \inf_{k \ge n} a_k \le a_n \le \sup_{k \ge n} a_k \quad \text{for all } n \in \mathbb{N}. \]

The convergence of $(a_n)_n$ to $\ell$ is now a consequence of the Squeeze Theorem.


For the sequence $a_n = (1 + (-1)^n)/2$, we have

\[ \liminf_{n \to \infty} a_n = 0 \quad\text{and}\quad \limsup_{n \to \infty} a_n = 1, \]

and its convergent subsequences converge either to 0 or to 1. This example illustrates
a general phenomenon.
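As a rough numerical illustration of the formulas $\liminf a_n = \sup_n \inf_{k \ge n} a_k$ and $\limsup a_n = \inf_n \sup_{k \ge n} a_k$, the Python sketch below works on a finite truncation of the sequence $a_n = (1 + (-1)^n)/2$. The helper `tail_inf_sup` and the cut-off values 100 and 90 are arbitrary choices made here; a finite computation can only suggest the values of the limits, not prove them.

```python
def tail_inf_sup(terms, n):
    """Return (inf, sup) of the finite tail terms[n:]."""
    tail = terms[n:]
    return min(tail), max(tail)


# a_n = (1 + (-1)^n) / 2 for n = 0, 1, ..., 99 (a finite truncation)
terms = [(1 + (-1) ** n) / 2 for n in range(100)]

# Every tail considered here still has at least 10 terms.
tails = [tail_inf_sup(terms, n) for n in range(90)]

approx_liminf = max(lo for lo, _ in tails)   # sup over n of the tail infima
approx_limsup = min(hi for _, hi in tails)   # inf over n of the tail suprema
print(approx_liminf, approx_limsup)
```

Each tail contains both values 0 and 1, so the tail infima are all 0 and the tail suprema are all 1, in agreement with the values stated above.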


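2.6.2 Theorem Let $(a_n)_n$ be a sequence of real numbers. Then:

\[ \liminf_{n \to \infty} a_n = \inf \left\{ \ell \in \overline{\mathbb{R}} : \text{there is } (k_n)_n \text{ such that } a_{k_n} \to \ell \right\} \]

and

\[ \limsup_{n \to \infty} a_n = \sup \left\{ \ell \in \overline{\mathbb{R}} : \text{there is } (k_n)_n \text{ such that } a_{k_n} \to \ell \right\}. \]


The proof of Theorem 2.6.2 is the objective of Exercise 3.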


Exercises 


1. Find the limits inferior and superior of the sequence of general term


(1+ (-D"/7!) logn 
log 2n , 


an = 


2. Prove (by contradiction) that:
(a) if $\liminf_{n \to \infty} a_n > \ell$, then at most finitely many terms $a_n$ are less than $\ell$.
(b) if $\limsup_{n \to \infty} a_n < \ell$, then at most finitely many terms $a_n$ are bigger than $\ell$.

3. Prove Theorem 2.6.2 by using the previous exercise.

4. Suppose that $(a_n)_n$ and $(b_n)_n$ are two sequences of real numbers such that
$\lim_{n \to \infty} a_n = \ell$ exists in $\mathbb{R}$. Prove that:
(a) $\limsup_{n \to \infty} (a_n + b_n) = \ell + \limsup_{n \to \infty} b_n$;
(b) $\limsup_{n \to \infty} (a_n b_n) = \ell \limsup_{n \to \infty} b_n$ provided that $\ell > 0$.
(c) The assertions (a) and (b) also work when limit superior is replaced by limit
inferior.

5. Let $(a_n)_n$ be a sequence of positive numbers. Prove that

\[ \liminf_{n \to \infty} \frac{a_{n+1}}{a_n} \le \liminf_{n \to \infty} \sqrt[n]{a_n} \le \limsup_{n \to \infty} \sqrt[n]{a_n} \le \limsup_{n \to \infty} \frac{a_{n+1}}{a_n} \]

and conclude that if the limit $\lim_{n \to \infty} \frac{a_{n+1}}{a_n} = \ell$ exists, then the limit $\lim_{n \to \infty} \sqrt[n]{a_n}$ also
exists and equals $\ell$.

6. Let $(a_n)_n$ be a sequence of real numbers such that $a_{m+n} \le a_m + a_n$ for all indices
$m$ and $n$. Prove that

\[ \lim_{n \to \infty} \frac{a_n}{n} = \inf_{n \ge 1} \frac{a_n}{n}. \]

7. Let $(a_n)_n$ be a sequence of positive numbers having limit in $\overline{\mathbb{R}}$ and let $p$ be a
natural number. Prove that

\[ \limsup_{n \to \infty} \left(\frac{a_1 + a_{n+p}}{a_n}\right)^n \ge e^p. \]

Find a sequence for which equality occurs.


2.7 The Stolz—Cesaro Theorem 


In connection with the operations with sequences having limits (finite or infinite) we
ran into the problem of "eliminating the indeterminates". We know that if $a_n \to \ell$
and $b_n \to \ell'$ (with $\ell' \ne 0$), we have

\[ \frac{a_n}{b_n} \to \frac{\ell}{\ell'}. \]

It may happen that $a_n \to 0$ and $b_n \to 0$, but the sequence $\left(\frac{a_n}{b_n}\right)_n$ is still convergent!
For example, this is the case when $a_n = b_n = \frac{1}{n}$.

The general cases when "the elimination of indeterminates" is possible are
the object of the so-called Stolz-Cesàro theorems (which are nothing but the discrete
analogues of the Bernoulli-L'Hôpital Rules in Sect. 8.3).


2.7.1 Theorem (Stolz-Cesàro Theorem, the case $\frac{0}{0}$) Let $(a_n)_n$ and $(b_n)_n$ be two
sequences of real numbers convergent to 0, such that $(b_n)_n$ is strictly monotone and

\[ \lim_{n \to \infty} \frac{a_{n+1} - a_n}{b_{n+1} - b_n} = \ell \quad (\text{possibly in } \overline{\mathbb{R}}). \]

Then the limit $\lim_{n \to \infty} \frac{a_n}{b_n}$ also exists and equals $\ell$.


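Proof Clearly, it suffices to consider only the case when $(b_n)_n$ is strictly decreasing
(so that $b_n > 0$ for all $n$).

Case 1: $\lim_{n \to \infty} \frac{a_{n+1} - a_n}{b_{n+1} - b_n} = \ell \in \mathbb{R}$. Then for any $\varepsilon > 0$, there is an index $N$ such
that

\[ \ell - \varepsilon < \frac{a_{n+1} - a_n}{b_{n+1} - b_n} < \ell + \varepsilon \]

for all $n \ge N$. By the hypothesis, $b_{n+1} - b_n < 0$ for all $n$, so that

\[ (\ell - \varepsilon)(b_{n+1} - b_n) > a_{n+1} - a_n > (\ell + \varepsilon)(b_{n+1} - b_n) \]

for all $n \ge N$. For a fixed such number $n$, we write down the inequalities corresponding
to $n, n+1, \ldots, n+p-1$, and we add them side by side. We get

\[ (\ell - \varepsilon)(b_{n+p} - b_n) > a_{n+p} - a_n > (\ell + \varepsilon)(b_{n+p} - b_n). \]

Taking the limit as $p \to \infty$, we obtain

\[ (\ell - \varepsilon)(-b_n) \ge -a_n \ge (\ell + \varepsilon)(-b_n), \]

from which we conclude that $\ell - \varepsilon \le \frac{a_n}{b_n} \le \ell + \varepsilon$ for all $n \ge N$.

Case 2: $\lim_{n \to \infty} \frac{a_{n+1} - a_n}{b_{n+1} - b_n} = \infty$. Then for any $\varepsilon > 0$, there is an index $N$ such that

\[ \frac{a_{n+1} - a_n}{b_{n+1} - b_n} > \varepsilon \]

for all $n \ge N$. As a consequence, we get

\[ a_n - a_m = \sum_{k=n}^{m-1} (a_k - a_{k+1}) > \varepsilon \sum_{k=n}^{m-1} (b_k - b_{k+1}) = \varepsilon (b_n - b_m) \]

for all $m > n \ge N$, and thus

\[ \frac{a_n}{b_n} > \varepsilon \left(1 - \frac{b_m}{b_n}\right) + \frac{a_m}{b_n}. \]

Keeping $n$ fixed and taking the limit as $m \to \infty$, we obtain that $\frac{a_n}{b_n} \ge \varepsilon$ (for all
$n \ge N$). Consequently, $\lim_{n \to \infty} \frac{a_n}{b_n} = \infty$.

Case 3: $\lim_{n \to \infty} \frac{a_{n+1} - a_n}{b_{n+1} - b_n} = -\infty$. This case is similar to the second.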

In a similar way, one can prove the following result. 


2.7.2 Theorem (Stolz-Cesàro Theorem, the case of sequences with infinite limit)
Let $(a_n)_n$ and $(b_n)_n$ be two sequences of real numbers such that $(b_n)_n$ is strictly
increasing to $\infty$ and

\[ \lim_{n \to \infty} \frac{a_{n+1} - a_n}{b_{n+1} - b_n} = \ell \quad (\text{possibly in } \overline{\mathbb{R}}). \]

Then the limit $\lim_{n \to \infty} \frac{a_n}{b_n}$ also exists and equals $\ell$.

The Stolz-Cesàro Theorem (in its different variants) has numerous applications.
We indicate here two examples. The first example shows that the arithmetic (as well
as the geometric) mean of the terms of a convergent sequence also converges (and to
the same limit).


2.7.3 Corollary (A.-L. Cauchy) (a) If $a_n \to \ell$ in $\overline{\mathbb{R}}$, then

\[ \frac{a_1 + \cdots + a_n}{n} \to \ell. \]

(b) If $a_n \to \ell$ in $\overline{\mathbb{R}}$ and all terms $a_n$ are positive, then

\[ (a_1 \cdots a_n)^{1/n} \to \ell. \]

In connection with Corollary 2.7.3, let us call a numerical sequence $(a_n)_n$ Cesàro
convergent to $\ell$ if $\frac{a_1 + \cdots + a_n}{n} \to \ell$. While convergence implies Cesàro convergence,
the converse is not true. See the case of the sequence $((-1)^n)_n$. However, we can
relate the Cesàro convergence with the convergence of some subsequences. This
is the objective of Theorem 2.8.1.
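The Python sketch below illustrates, numerically only, why the sequence $((-1)^n)_n$ is Cesàro convergent to 0 although it is divergent. The function name `cesaro_means` and the cut-off 10000 are choices made for this illustration.

```python
from itertools import accumulate


def cesaro_means(terms):
    """Arithmetic means (a_1 + ... + a_n) / n for n = 1, ..., len(terms)."""
    return [s / n for n, s in enumerate(accumulate(terms), start=1)]


signs = [(-1) ** n for n in range(1, 10001)]   # -1, 1, -1, 1, ...
means = cesaro_means(signs)
print(means[0], means[1], means[-1])
```

The partial sums alternate between $-1$ and 0, so the $n$-th mean is either $-1/n$ or 0, which makes the convergence of the means to 0 visible even though the terms themselves keep oscillating.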

A second application of the Stolz-Cesàro Theorem refers to the rate of convergence
of a sequence. We know that

\[ 1 + \frac{1}{2} + \cdots + \frac{1}{n} - \log n \to \gamma, \]

from where it follows that $\frac{1}{n+1} + \cdots + \frac{1}{2n} \to \log 2$. Theorem 2.7.1 allows us to
make this conclusion more precise as follows:

\[ \lim_{n \to \infty} n \left(\frac{1}{n+1} + \cdots + \frac{1}{2n} - \log 2\right) = -\frac{1}{4}. \]
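A quick floating-point experiment is consistent with the limit $-1/4$ stated above. In the Python sketch below, the name `scaled_gap` is introduced here for the quantity $n\left(\frac{1}{n+1} + \cdots + \frac{1}{2n} - \log 2\right)$; the computation is an illustration subject to rounding errors, not a proof.

```python
from math import log


def scaled_gap(n: int) -> float:
    """n * (1/(n+1) + ... + 1/(2n) - log 2), in floating point."""
    partial = sum(1 / k for k in range(n + 1, 2 * n + 1))
    return n * (partial - log(2))


for n in (10, 100, 1000, 10000):
    print(n, scaled_gap(n))
```

The printed values approach $-0.25$ as $n$ grows, which matches the rate of convergence obtained from Theorem 2.7.1.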
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Exercises 


1. Prove that if a, + and ¢ # 0 then -—"_- > €. 
a 


an 


2. For $a \in (0, 1)$, we define the recurrent sequence $(x_n)_n$ as follows:

\[ x_0 = a \quad\text{and}\quad x_{n+1} = x_n(1 - x_n) \ \text{ for } n \ge 0. \]

(a) Prove that the sequence $(x_n)_n$ converges to 0.
(b) Prove that $n x_n \to 1$.
3. Infer from Corollary 2.7.3 that every sequence $(a_n)_n$ of positive numbers for which
$\lim_{n \to \infty} \frac{a_{n+1}}{a_n} = \ell$, verifies also $\lim_{n \to \infty} \sqrt[n]{a_n} = \ell$. As an application, show that

\[ \lim_{n \to \infty} \sqrt[n]{n} = 1 \quad\text{and}\quad \lim_{n \to \infty} \frac{\sqrt[n]{n!}}{n} = \frac{1}{e}. \]

4. Show by example that the converse of Theorem 2.7.2 may fail. Then prove that
if $(a_n)_n$ and $(b_n)_n$ are two sequences of real numbers such that $(b_n)_n$ is strictly
increasing to $\infty$ and $\lim_{n \to \infty} \frac{b_{n+1}}{b_n} = b \in \mathbb{R} \setminus \{1\}$, then

\[ \lim_{n \to \infty} \frac{a_n}{b_n} = \ell \quad\text{implies}\quad \lim_{n \to \infty} \frac{a_{n+1} - a_n}{b_{n+1} - b_n} = \ell. \]


2.8 Notes and Remarks 


The rigorous definition of limit was given by Bernard Bolzano (Der binomische 
Lehrsatz, Prague, 1816, work little noticed at the time), and by Weierstrass in his 
lectures at Berlin University. 

A nice account on the algorithms provided by iterative sequences (including the 
Babylonian algorithm) is given by Bailey [1]. A method for finding an approximation 
to a square root equivalent to two iterations of the Babylonian algorithm at each 
step is described in an ancient Indian mathematical manuscript called the Bakhshali 
manuscript. Given a number $a > 0$, one considers $N^2$, the largest perfect square
less than $a$, and $\sqrt{a}$ is approximated by the iterative sequence

\[ x_0 = N \quad\text{and}\quad x_{n+1} = x_n + \frac{a - x_n^2}{2x_n} - \frac{\left(\frac{a - x_n^2}{2x_n}\right)^2}{2\left(x_n + \frac{a - x_n^2}{2x_n}\right)}. \]

In the case of $a = 336009$, this sequence starts with $x_0 = 579$. The first iterate
$x_1 = 579.662\,833\,033\,259\ldots$ represents an approximation of

\[ \sqrt{336009} = 579.662\,833\,033\,134\ldots \]

with 12 significant digits! See the paper by Bailey and Borwein [2] for details.
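The first iterate quoted above can be reproduced with a few lines of Python. This is a minimal floating-point sketch; the function name `bakhshali_step` is chosen here, and the step is written as a Babylonian update followed by a second correction, which is algebraically the same as two Babylonian iterations.

```python
from math import isqrt, sqrt


def bakhshali_step(a: float, x: float) -> float:
    """One step of the iteration: two Babylonian updates fused into one."""
    d = (a - x * x) / (2 * x)      # Babylonian correction (a - x^2) / (2x)
    y = x + d                      # result of the first Babylonian step
    return y - d * d / (2 * y)     # second step, since a - y^2 = -d^2


a = 336009
x0 = isqrt(a)                      # 579, because 579**2 = 335241 < a < 580**2
x1 = bakhshali_step(a, x0)
print(x0, x1, sqrt(a))
```

In double precision, `x1` and `sqrt(a)` agree to about ten decimal places, in line with the 12 significant digits mentioned in the text.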

In Exercise 3, Sect. 2.2, we touched the subject of continued fractions. The simple
continued fractions algorithm allows us to represent each real number $x$ by a (finite
or infinite) string of the form

\[ x = a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cdots}} = [a_0; a_1, a_2, \ldots], \tag{2.2} \]

where $a_0 \in \mathbb{Z}$ and all other coefficients $a_n$ are strictly positive integers. This is
done inductively as follows. In the first step, we choose $a_0 = \lfloor x \rfloor$. If $\{x\} = 0$, the
algorithm stops. If $\{x\} > 0$, then we pass to the next step. We have $x = a_0 + \{x\}$
and $x_1 = \frac{1}{\{x\}} > 1$. We choose $a_1 = \lfloor x_1 \rfloor$. This yields $x_1 = a_1 + \{x_1\}$ and

\[ x = a_0 + \frac{1}{a_1 + \{x_1\}}. \]

If $\{x_1\} = 0$, the algorithm stops. If $\{x_1\} > 0$, the algorithm continues with the
decomposition of $x_2 = \frac{1}{\{x_1\}} > 1$ and so on. Clearly, this algorithm produces an
infinite continued fraction if and only if $x$ is an irrational number.

A good introduction to the theory of continued fractions can be found in the books
of Khinchin [3] and Lang [4]. It is proved that every irrational number $x$ is the limit
of the sequence of fractions

\[ \frac{p_n}{q_n} = [a_0; a_1, a_2, \ldots, a_n] = a_0 + \cfrac{1}{a_1 + \cfrac{1}{\ddots + \cfrac{1}{a_n}}}, \]

called convergents, and

\[ \frac{1}{q_n (q_n + q_{n+1})} < \left| x - \frac{p_n}{q_n} \right| < \frac{1}{q_n q_{n+1}} \]

for all $n \in \mathbb{N}$. Moreover, the fractions $\frac{p_n}{q_n}$ provide the best approximation of $x$ by
rationals in the sense that for any other fraction $\frac{r}{s}$ with $0 < s \le q_n$, we have

\[ |sx - r| > |q_n x - p_n|. \]

The golden ratio $g = \frac{1 + \sqrt{5}}{2}$ has the continued fraction expansion $[1; 1, 1, 1, \ldots]$.
See Exercise 3, Sect. 2.2. Since none of its coefficients is greater than 1, $g$ is one of
the most "difficult" real numbers to approximate with rational numbers. Indeed, a
result due to Adolf Hurwitz asserts that any irrational number $x$ can be approximated by
infinitely many fractions $p/q$ such that $\left| x - \frac{p}{q} \right| < \frac{1}{\sqrt{5}\, q^2}$. The case of $g$ shows that
one cannot replace $\sqrt{5}$ by any greater constant.

The recent handbook by Cuyt et al. [5] offers a nice account of the present state
of the art of continued fractions and their applications.
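The simple continued fraction algorithm and the convergents can be tried out with exact rational arithmetic. The Python sketch below is an illustration under the following assumptions: the input is a rational number (so the algorithm terminates), the function names `continued_fraction` and `convergents` are chosen here, and the convergents are computed with the standard recurrences $p_n = a_n p_{n-1} + p_{n-2}$, $q_n = a_n q_{n-1} + q_{n-2}$, which are not stated in the text.

```python
from fractions import Fraction
from math import floor


def continued_fraction(x: Fraction, max_terms: int = 50) -> list:
    """Coefficients [a0; a1, a2, ...] of a rational x (integer and fractional parts)."""
    coeffs = []
    for _ in range(max_terms):
        a = floor(x)               # integer part of the current x
        coeffs.append(a)
        frac = x - a               # fractional part {x}
        if frac == 0:              # the algorithm stops
            break
        x = 1 / frac               # next value 1/{x} > 1
    return coeffs


def convergents(coeffs: list) -> list:
    """Fractions p_n/q_n = [a0; a1, ..., an] via the standard recurrences."""
    p_prev, q_prev, p, q = 1, 0, coeffs[0], 1
    result = [Fraction(p, q)]
    for a in coeffs[1:]:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        result.append(Fraction(p, q))
    return result


print(continued_fraction(Fraction(415, 93)))
print(convergents([1] * 8))
```

For the truncated expansion $[1; 1, \ldots, 1]$ of the golden ratio, the convergents are ratios of consecutive Fibonacci numbers, which approach $g$ rather slowly, in keeping with the remark that $g$ is hard to approximate by rationals.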

The result of Corollary 2.7.3, concerning the arithmetic mean of terms of a con- 
vergent sequence can be considerably improved using the notion of convergence in 
density. 

The main idea is to eliminate a negligible part of the indices and to take into
consideration the rest of the terms. Precisely, if $J \subset \mathbb{N}$ is a subset such that $\mathbb{N} \setminus J$ is
infinite, one may define in $\mathbb{R}$ limits of the form

\[ \lim_{n \to \infty,\ n \notin J} a_n = \ell, \]

having the meaning that for any $\varepsilon > 0$, we can choose an index $N$ for which
$|a_n - \ell| < \varepsilon$, whenever $n \ge N$, $n \notin J$.
Call a subset $J \subset \mathbb{N}$ a set of zero density if $\lim_{n \to \infty} \frac{|\{1, \ldots, n\} \cap J|}{n} = 0$. Clearly, the
complementary set of a set of zero density is infinite.

A numerical sequence $(a_n)_n$ converges to $\ell$ in density if there is a set of zero density
$J \subset \mathbb{N}$ such that $\lim_{n \to \infty,\ n \notin J} a_n = \ell$. The main feature of this type of convergence is
its connection with the convergence in the sense of Cesàro:


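2.8.1 Theorem (Bernard O. Koopman and John von Neumann) Suppose that $(a_n)_n$
is a bounded sequence of positive numbers. Then the following two conditions are
equivalent:

(a) $\lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} a_k = 0$;
(b) $a_n \to 0$ in density.


Proof Assuming $\lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} a_k = 0$, we associate to each $\varepsilon > 0$ the set
$A_\varepsilon = \{n \in \mathbb{N} : a_n \ge \varepsilon\}$. Since

\[ \frac{|\{1, \ldots, n\} \cap A_\varepsilon|}{n} \le \frac{1}{\varepsilon n} \sum_{k=1}^{n} a_k \to 0 \]

as $n \to \infty$, we infer that each of the sets $A_\varepsilon$ has zero density. Therefore, $a_n \to 0$ in
density.

Suppose now that $(a_n)_n$ is bounded and converges to 0 in density. Then for every
$\varepsilon > 0$, there is a set $J$ of zero density outside which $a_n < \varepsilon$. Since

\[ \frac{1}{n} \sum_{k=1}^{n} a_k = \frac{1}{n} \sum_{k \in \{1, \ldots, n\} \cap J} a_k + \frac{1}{n} \sum_{k \in \{1, \ldots, n\} \setminus J} a_k \le \frac{|\{1, \ldots, n\} \cap J|}{n} \sup_{k \in \mathbb{N}} a_k + \varepsilon \]

and $\lim_{n \to \infty} \frac{|\{1, \ldots, n\} \cap J|}{n} = 0$, we conclude that $\lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} a_k = 0$.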
k=1 


noo nu noo! 


An application of Theorem 2.8.1 to the convergence of series is the objective
of Exercise 4, Sect. 4.2. Theorem 2.8.1 was recently extended by Niculescu and
Popovici [6, 7] to higher order densities, in particular to the harmonic density,

\[ d_h(A) = \lim_{n \to \infty} \frac{1}{\log n} \sum_{k=1}^{n} \frac{\chi_A(k)}{k}, \]

where $\chi_A$ represents the characteristic function of $A$.

See also the Notes and Remarks at the end of Chap. 4. 

Cesàro convergence extends the concept of limit beyond the class of convergent
sequences, but not to all bounded sequences. This was accomplished by Stefan
Banach. Using the Axiom of Choice, he proved the existence of generalized limits, that
is, of mappings $\mathrm{LIM} : \ell^\infty(\mathbb{N}) \to \mathbb{R}$ having the following four properties:

(LIM1) If $(x_n)_n$ is a convergent sequence, then $\mathrm{LIM}((x_n)_n) = \lim_{n \to \infty} x_n$;

(LIM2) (Linearity): $\mathrm{LIM}(a(x_n)_n + b(y_n)_n) = a\, \mathrm{LIM}((x_n)_n) + b\, \mathrm{LIM}((y_n)_n)$ for all
numbers $a, b \in \mathbb{R}$ and all bounded sequences $(x_n)_n$ and $(y_n)_n$;

(LIM3) (Positivity): If $(x_n)_n \ge 0$, then $\mathrm{LIM}((x_n)_n) \ge 0$;

(LIM4) (Invariance): The limit of a sequence $(x_n)_n$ is the same as the limit of its
translate to the left, $(x_{n+1})_n$.


The generalized limits are not unique. Necessarily, they verify the double
inequality

\[ \liminf_{n \to \infty} x_n \le \mathrm{LIM}((x_n)_n) \le \limsup_{n \to \infty} x_n. \]

Details can be found in the book of Bhatia [8], pp. 34-35.

Chapter 3
The Euclidean Spaces $\mathbb{R}^p$ and $\mathbb{C}$


Though in this book we are primarily interested in the properties of functions of a 
real variable, it is useful to know some concepts and facts in a greater generality. 
This not only gives a better perspective of most of the results treated here, but also 
sheds some light on the way mathematics itself evolved. 

The fact that even simple algebraic equations such as $x^2 + 1 = 0$ have no solutions
in $\mathbb{R}$ requires us to enlarge the real field to a field in which all algebraic
equations with real coefficients are solvable. The answer proved to be the complex field $\mathbb{C}$, whose
presentation is the object of this chapter. It can be endowed naturally with a
Euclidean structure, which is responsible for the entire metric geometry in the real
plane (Pythagorean Theorem, Parallelogram Law, Hlawka's identity, etc.).

The convergence of sequences of complex numbers is equivalent to the simultaneous
convergence of the sequences of their real and imaginary components. As a
straightforward consequence, a number of basic results such as the Bolzano-Weierstrass
Theorem and Cauchy's Criterion are still valid in this context.


3.1 p-Dimensional Space 


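The Cartesian product $\mathbb{R}^p = \underbrace{\mathbb{R} \times \mathbb{R} \times \cdots \times \mathbb{R}}_{p \text{ factors}}$ ($p \ge 2$) is the set of all $p$-tuples
$\mathbf{x} = (x_1, x_2, \ldots, x_p)$ of real numbers; the elements $x_1, x_2, \ldots, x_p$ are the components
of $\mathbf{x}$, and the functions

\[ \mathrm{pr}_k : \mathbb{R}^p \to \mathbb{R}, \quad \mathrm{pr}_k(\mathbf{x}) = x_k \quad (1 \le k \le p) \]

are the coordinate projections.

$\mathbb{R}^p$ carries a natural structure of linear space, defined coordinatewise, with which
the reader is familiar from the linear algebra course. If $\mathbf{x}$ and $\mathbf{y}$ are in $\mathbb{R}^p$ and
$\alpha \in \mathbb{R}$, then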


© Springer India 2014 67 
A.D.R. Choudary and C.P. Niculescu, Real Analysis on Intervals, 
DOI 10.1007/978-8 1-322-2148-7_3 


\[ \mathbf{x} + \mathbf{y} = (x_1, x_2, \ldots, x_p) + (y_1, y_2, \ldots, y_p) = (x_1 + y_1, x_2 + y_2, \ldots, x_p + y_p) \]

and

\[ \alpha \mathbf{x} = \alpha (x_1, x_2, \ldots, x_p) = (\alpha x_1, \alpha x_2, \ldots, \alpha x_p). \]


We will refer to $\mathbb{R}^p$ as the real $p$-dimensional space, when endowed with this
structure. In this context, the real numbers will be called scalars, while the elements
of $\mathbb{R}^p$ will be called vectors. The null element of $\mathbb{R}^p$ (also called the origin) is the
vector $\mathbf{0}$, with all components 0.

The vectors

\[ \mathbf{e}_1 = (1, 0, \ldots, 0), \quad \mathbf{e}_2 = (0, 1, \ldots, 0), \quad \ldots, \quad \mathbf{e}_p = (0, 0, \ldots, 1) \]

constitute the natural basis of $\mathbb{R}^p$. Every vector $\mathbf{x}$ can be uniquely represented as a
linear combination of the elements of this basis:

\[ \mathbf{x} = \sum_{k=1}^{p} x_k \mathbf{e}_k. \]


3.1.1 Remark The addition in $\mathbb{R}^2$ corresponds to the parallelogram law of addition
of planar forces in Mechanics. Indeed, let us associate to the point $\mathbf{x} = (x_1, x_2)$ the
vector $\overrightarrow{OM}$, where $M$ is the point of coordinates $x_1, x_2$. If $\overrightarrow{OM_1}$ corresponds to $\mathbf{x}$
and $\overrightarrow{OM_2}$ corresponds to $\mathbf{y} = (y_1, y_2)$, then $\overrightarrow{OP}$ corresponds to $\mathbf{x} + \mathbf{y}$, where $P$ is the
fourth vertex of the parallelogram with the other three vertices $O$, $M_1$ and $M_2$.
See Fig. 3.1.


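Fig. 3.1 The addition of vectors

The linear structure of $\mathbb{R}^p$ is enriched by the presence of the Euclidean scalar product
(or inner product), which associates to each pair of vectors $\mathbf{x} = (x_1, x_2, \ldots, x_p)$
and $\mathbf{y} = (y_1, y_2, \ldots, y_p)$, the number

\[ \langle \mathbf{x}, \mathbf{y} \rangle = \sum_{k=1}^{p} x_k y_k. \tag{3.1} \]

The following properties of the Euclidean scalar product are immediate:

(SP1) $\langle \mathbf{x}, \mathbf{x} \rangle \ge 0$; $\langle \mathbf{x}, \mathbf{x} \rangle = 0$ if and only if $\mathbf{x} = \mathbf{0}$
(SP2) $\langle \mathbf{x}, \mathbf{y} \rangle = \langle \mathbf{y}, \mathbf{x} \rangle$
(SP3) $\langle \alpha \mathbf{x} + \beta \mathbf{y}, \mathbf{z} \rangle = \alpha \langle \mathbf{x}, \mathbf{z} \rangle + \beta \langle \mathbf{y}, \mathbf{z} \rangle$

for every $\mathbf{x}, \mathbf{y}, \mathbf{z} \in \mathbb{R}^p$ and every $\alpha, \beta \in \mathbb{R}$.
The number

\[ \|\mathbf{x}\| = \sqrt{\langle \mathbf{x}, \mathbf{x} \rangle} = \left(\sum_{k=1}^{p} x_k^2\right)^{1/2} \]

is called the Euclidean norm (or the length) of $\mathbf{x}$. The basic properties of the Euclidean
norm extend those of the absolute value, as presented in Sect. 1.2.

(NORM1) $\|\mathbf{x}\| \ge 0$;
(NORM2) $\|\mathbf{x}\| = 0$ if and only if $\mathbf{x} = \mathbf{0}$;
(NORM3) $\|\alpha \mathbf{x}\| = |\alpha| \cdot \|\mathbf{x}\|$;
(NORM4) $\|\mathbf{x} + \mathbf{y}\| \le \|\mathbf{x}\| + \|\mathbf{y}\|$.

The nontrivial property is (NORM4), also known as the Minkowski (or triangle)
inequality. Using coordinates, this inequality reads as follows:

\[ \left(\sum_{k=1}^{p} (x_k + y_k)^2\right)^{1/2} \le \left(\sum_{k=1}^{p} x_k^2\right)^{1/2} + \left(\sum_{k=1}^{p} y_k^2\right)^{1/2}. \]

The proof needs the Cauchy-Buniakovski-Schwarz inequality:


3.1.2 Theorem (Cauchy-Buniakovski-Schwarz Inequality) For every pair of
vectors $\mathbf{x}$ and $\mathbf{y}$ in $\mathbb{R}^p$,

\[ |\langle \mathbf{x}, \mathbf{y} \rangle| \le \|\mathbf{x}\| \, \|\mathbf{y}\|. \]

The equality holds if and only if $\mathbf{x}$ and $\mathbf{y}$ are linearly dependent.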


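Proof We will present here a proof based entirely on the definition of Euclidean norm
and the properties (SP1)-(SP3) of the scalar product. Another proof was sketched in
Exercise 6, at the end of Sect. 1.2.

If $\|\mathbf{y}\| = 0$, then $\mathbf{y} = \mathbf{0}$ and the conclusion follows. If $\|\mathbf{y}\| \ne 0$, then

\[ 0 \le \langle \mathbf{x} + t\mathbf{y}, \mathbf{x} + t\mathbf{y} \rangle = \langle \mathbf{x}, \mathbf{x} \rangle + 2t \langle \mathbf{x}, \mathbf{y} \rangle + t^2 \langle \mathbf{y}, \mathbf{y} \rangle = \|\mathbf{x}\|^2 + 2t \langle \mathbf{x}, \mathbf{y} \rangle + t^2 \|\mathbf{y}\|^2 \]

for all $t \in \mathbb{R}$. In particular, for $t = -\frac{\langle \mathbf{x}, \mathbf{y} \rangle}{\|\mathbf{y}\|^2}$, we obtain

\[ 0 \le \|\mathbf{x}\|^2 - 2\frac{|\langle \mathbf{x}, \mathbf{y} \rangle|^2}{\|\mathbf{y}\|^2} + \frac{|\langle \mathbf{x}, \mathbf{y} \rangle|^2}{\|\mathbf{y}\|^2} = \|\mathbf{x}\|^2 - \frac{|\langle \mathbf{x}, \mathbf{y} \rangle|^2}{\|\mathbf{y}\|^2}, \]

and the proof of the inequality is done.
It is clear that the above computation yields a strict inequality unless $\mathbf{x} + t\mathbf{y} = \mathbf{0}$.

If $\alpha \mathbf{x} = \beta \mathbf{y}$ for suitable numbers $\alpha, \beta$, not both 0, then the inequality becomes
an equality.


Taking into account the Cauchy-Buniakovski-Schwarz inequality, we obtain the
property (NORM4) as follows:

\[ \|\mathbf{x} + \mathbf{y}\|^2 = \langle \mathbf{x} + \mathbf{y}, \mathbf{x} + \mathbf{y} \rangle = \|\mathbf{x}\|^2 + 2\langle \mathbf{x}, \mathbf{y} \rangle + \|\mathbf{y}\|^2 \le \|\mathbf{x}\|^2 + 2\|\mathbf{x}\| \, \|\mathbf{y}\| + \|\mathbf{y}\|^2 = (\|\mathbf{x}\| + \|\mathbf{y}\|)^2. \]

The linear space $\mathbb{R}^p$ endowed with the scalar product (3.1) is referred to as the
$p$-dimensional Euclidean space. Its elements are also called points. The distance
between two points $\mathbf{x}$ and $\mathbf{y}$ in $\mathbb{R}^p$ is defined by the formula

\[ d(\mathbf{x}, \mathbf{y}) = \|\mathbf{x} - \mathbf{y}\| = \left(\sum_{k=1}^{p} (x_k - y_k)^2\right)^{1/2}. \]

This agrees with the usual notion of distance in the real line, in the plane and also
in the space. Several geometric consequences of the presence of Euclidean structure
in $\mathbb{R}^p$ are presented in the exercises below. A systematic treatment of the notion of
distance is presented in Sect. 5.1.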
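The scalar product, the norm and the distance translate directly into code. The Python sketch below represents vectors of $\mathbb{R}^p$ as plain lists of floats; the function names `inner`, `norm` and `dist` and the two sample vectors are choices made for this illustration, which merely checks the Cauchy-Buniakovski-Schwarz and triangle inequalities on one example.

```python
from math import sqrt


def inner(x, y):
    """Euclidean scalar product (3.1): sum of x_k * y_k."""
    return sum(xk * yk for xk, yk in zip(x, y))


def norm(x):
    """Euclidean norm ||x|| = sqrt(<x, x>)."""
    return sqrt(inner(x, x))


def dist(x, y):
    """Euclidean distance d(x, y) = ||x - y||."""
    return norm([xk - yk for xk, yk in zip(x, y)])


x, y = [1.0, 2.0, 2.0], [3.0, 0.0, 4.0]
total = [xk + yk for xk, yk in zip(x, y)]

print(inner(x, y), norm(x), norm(y), dist(x, y))
print(abs(inner(x, y)) <= norm(x) * norm(y))   # Cauchy-Buniakovski-Schwarz
print(norm(total) <= norm(x) + norm(y))        # (NORM4)
```

Here $\langle \mathbf{x}, \mathbf{y} \rangle = 11$ while $\|\mathbf{x}\| \, \|\mathbf{y}\| = 15$; the inequality is strict because the two sample vectors are not linearly dependent.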


Exercises 
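1. (Equality in the triangle inequality). Prove that the equality

\[ \|\mathbf{x} + \mathbf{y}\| = \|\mathbf{x}\| + \|\mathbf{y}\| \]

occurs in $\mathbb{R}^p$ if and only if the points $\mathbf{x}$ and $\mathbf{y}$ belong to the same half-line from
the origin (that is, if there is $t \ge 0$ such that $\mathbf{x} = t\mathbf{y}$ or $\mathbf{y} = t\mathbf{x}$).
2. Infer from the equality $\|\mathbf{x}\|^2 = \langle \mathbf{x}, \mathbf{x} \rangle$ the parallelogram identity,

\[ \|\mathbf{x} + \mathbf{y}\|^2 + \|\mathbf{x} - \mathbf{y}\|^2 = 2\left(\|\mathbf{x}\|^2 + \|\mathbf{y}\|^2\right), \]

which works for all $\mathbf{x}, \mathbf{y} \in \mathbb{R}^p$. Geometrically it says that in every parallelogram,
the sum of the squares of its diagonals equals the sum of the squares of its sides.
3. (Hlawka's Identity). Prove that for every triplet $\mathbf{x}, \mathbf{y}, \mathbf{z}$ of vectors of $\mathbb{R}^p$,

\[ \|\mathbf{x}\|^2 + \|\mathbf{y}\|^2 + \|\mathbf{z}\|^2 + \|\mathbf{x} + \mathbf{y} + \mathbf{z}\|^2 = \|\mathbf{x} + \mathbf{y}\|^2 + \|\mathbf{y} + \mathbf{z}\|^2 + \|\mathbf{z} + \mathbf{x}\|^2. \]

Then, infer from it the inequality

\[ \|\mathbf{x}\| + \|\mathbf{y}\| + \|\mathbf{z}\| + \|\mathbf{x} + \mathbf{y} + \mathbf{z}\| \ge \|\mathbf{x} + \mathbf{y}\| + \|\mathbf{y} + \mathbf{z}\| + \|\mathbf{z} + \mathbf{x}\|. \]

4. In the Euclidean space $\mathbb{R}^p$, one can introduce the notion of angle of any pair of
nonzero vectors, $\mathbf{x}$ and $\mathbf{y}$, via the formula

\[ \cos(\widehat{\mathbf{x}, \mathbf{y}}) = \frac{\langle \mathbf{x}, \mathbf{y} \rangle}{\|\mathbf{x}\| \cdot \|\mathbf{y}\|}. \]

Accordingly, we say that two vectors $\mathbf{x}, \mathbf{y} \in \mathbb{R}^p$ are orthogonal (and denote
$\mathbf{x} \perp \mathbf{y}$) if $\langle \mathbf{x}, \mathbf{y} \rangle = 0$.
(a) (The Law of Cosines). Prove that

\[ \|\mathbf{x} + \mathbf{y}\|^2 = \|\mathbf{x}\|^2 + \|\mathbf{y}\|^2 + 2\|\mathbf{x}\| \cdot \|\mathbf{y}\| \cos(\widehat{\mathbf{x}, \mathbf{y}}) \]

for all nonzero vectors $\mathbf{x}, \mathbf{y} \in \mathbb{R}^p$.
(b) (Pythagorean Theorem). Prove that two vectors $\mathbf{x}, \mathbf{y} \in \mathbb{R}^p$ are orthogonal if
and only if the following equality holds:

\[ \|\mathbf{x} + \mathbf{y}\|^2 = \|\mathbf{x}\|^2 + \|\mathbf{y}\|^2. \]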


3.2 Convergence and Completeness in $\mathbb{R}^p$


The notions of bounded set, convergent sequence, limit, and Cauchy sequence can
be extended verbatim from $\mathbb{R}$ to $\mathbb{R}^p$ by using the Euclidean norm instead of the
absolute value function. Each of them can be characterized coordinatewise due to
the following result:


3.2.1 Lemma The Euclidean norm of every point $x = (x_1, x_2, \dots, x_p)$ in $\mathbb{R}^p$ verifies
the double inequality:

$$\max_{1 \le k \le p} |x_k| \le \|x\| \le \sqrt{p} \cdot \max_{1 \le k \le p} |x_k|.$$


Proof Clearly, $|x_k| \le \|x\| = \left(\sum_{j=1}^{p} x_j^2\right)^{1/2}$ for each index $k$, which gives us the
left-hand side inequality. For the right-hand side inequality, notice that

$$\|x\| = \left(\sum_{k=1}^{p} x_k^2\right)^{1/2} \le \left(p \cdot \max_{1 \le k \le p} x_k^2\right)^{1/2} = \sqrt{p} \cdot \max_{1 \le k \le p} |x_k|.$$


A subset $A$ of $\mathbb{R}^p$ is called bounded if there exists a real number $M > 0$ such that
for all $x \in A$, we have
$$\|x\| \le M.$$

According to Lemma 3.2.1, the bounded subsets of $\mathbb{R}^p$ are precisely those subsets
for which the components of their elements constitute a bounded set of real numbers.
In particular, every rectangle $[a, b] \times [c, d]$ in $\mathbb{R}^2$ is a bounded set.
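The double inequality of Lemma 3.2.1 is easy to observe numerically. The Python sketch below returns the three quantities that appear in it; the function name `lemma_bounds` is an assumption made for this illustration.

```python
import math


def lemma_bounds(x):
    """Return (max |x_k|, ||x||, sqrt(p) * max |x_k|) for a point x of R^p."""
    p = len(x)
    largest = max(abs(t) for t in x)
    euclid = math.sqrt(sum(t * t for t in x))
    return largest, euclid, math.sqrt(p) * largest


lower, middle, upper = lemma_bounds([3.0, -4.0])
print(lower <= middle <= upper)
```

For the point (1, 1, 1, 1) the right-hand inequality becomes an equality, which shows that the constant $\sqrt{p}$ cannot be improved.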

The definition of convergence of a sequence of points in $\mathbb{R}^p$ can also be introduced
following the model of $\mathbb{R}$.


3.2.2 Definition A sequence $(x_n)_n$ of points in $\mathbb{R}^p$ converges to a point $x$ (abbreviated,
$x_n \to x$) if for every $\varepsilon > 0$, there is an index $N \in \mathbb{N}$ such that for all $n \ge N$,
we have $\|x_n - x\| < \varepsilon$.


The point $x$ in Definition 3.2.2, if it exists, is unique. It is called the limit of the
sequence $(x_n)_n$ and is denoted by
$$\lim_{n \to \infty} x_n.$$

A convergent sequence in $\mathbb{R}^p$ is a sequence $(x_n)_n$ for which there is $x$ in $\mathbb{R}^p$ such
that $x_n \to x$. The sequences that are not convergent are said to be divergent.

All properties mentioned in the previous chapter about convergent sequences of
real numbers extend verbatim to the Euclidean context. This includes the boundedness of
a convergent sequence, the convergence of subsequences of a convergent sequence,
the algebraic operations with convergent sequences, etc.

As in the real case, we can restrict ourselves to the case of sequences of nonnegative
numbers that converge to zero. In fact,

$$x_n \to x \quad \text{if and only if} \quad \|x_n - x\| \to 0.$$

This remark proves very efficient because many times we get the convergence
$x_n \to x$ via a real sequence $(a_n)_n$ convergent to 0, such that

$$\|x_n - x\| \le a_n \quad \text{for all } n.$$


In $\mathbb{R}^p$, convergence is equivalent to coordinatewise convergence:

3.2.3 Proposition (a) If $x_n \to x$ in $\mathbb{R}^p$, then $\mathrm{pr}_k(x_n) \to \mathrm{pr}_k(x)$ in $\mathbb{R}$ for every index
$k$ in $\{1, 2, \dots, p\}$.

(b) Conversely, if $\mathrm{pr}_k(x_n) \to x_k$ in $\mathbb{R}$ for every index $k \in \{1, 2, \dots, p\}$, then
$x_n \to x$, where $x = (x_1, x_2, \dots, x_p)$.


Proof (a) Suppose that $x_n \to x$. Then for $\varepsilon > 0$ arbitrarily fixed, there is an index $N$
such that $\|x_n - x\| < \varepsilon$ for all $n \ge N$. Taking into account Lemma 3.2.1, we infer
that

$$|\mathrm{pr}_k x_n - \mathrm{pr}_k x| \le \|x_n - x\| < \varepsilon,$$

for all $n \ge N$ and $k \in \{1, 2, \dots, p\}$. Therefore, $\mathrm{pr}_k x_n \to \mathrm{pr}_k x$ for all
$k \in \{1, 2, \dots, p\}$.

(b) Conversely, suppose that each coordinate sequence $(\mathrm{pr}_k x_n)_n$ converges to
a real number $x_k$, whenever $k \in \{1, 2, \dots, p\}$. Then, given $\varepsilon > 0$ and $k$ in
$\{1, 2, \dots, p\}$, we may choose an index $N_k$ such that for all $n \ge N_k$,

$$|\mathrm{pr}_k(x_n) - x_k| < \frac{\varepsilon}{\sqrt{p}}.$$

Put $N = \max\{N_1, N_2, \dots, N_p\}$ and $x = (x_1, x_2, \dots, x_p)$. By Lemma 3.2.1, we
infer that for all $n \ge N$, we have

$$\|x_n - x\| \le \sqrt{p} \cdot \max_{1 \le k \le p} |\mathrm{pr}_k(x_n) - \mathrm{pr}_k(x)| < \varepsilon,$$

that is, $x_n \to x$. The proof is done.
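The link between the two kinds of convergence can be observed on a concrete sequence. The Python sketch below uses the sequence $x_n = (1/n,\ n/(n+1))$ in $\mathbb{R}^2$, whose limit is $(0, 1)$. Both the sequence and the names `x`, `LIMIT` and `errors` are assumptions chosen for this illustration.

```python
import math

LIMIT = (0.0, 1.0)  # coordinatewise limit of the sample sequence


def x(n):
    """n-th term of the sample sequence x_n = (1/n, n/(n+1)), for n >= 1."""
    return (1.0 / n, n / (n + 1.0))


def errors(n):
    """Return (largest coordinate error, Euclidean error) of x_n."""
    diffs = [abs(a - b) for a, b in zip(x(n), LIMIT)]
    return max(diffs), math.sqrt(sum(d * d for d in diffs))


for n in (1, 10, 100, 1000):
    coord_err, norm_err = errors(n)
    print(n, coord_err, norm_err)
```

By Lemma 3.2.1 the Euclidean error is squeezed between the largest coordinate error and $\sqrt{2}$ times it, so the two columns tend to 0 together.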


Following the model of $\mathbb{R}$, a sequence $(x_n)_n$ of points in $\mathbb{R}^p$ is called a Cauchy
sequence if for every $\varepsilon > 0$, there is an index $N$ such that $\|x_m - x_n\| < \varepsilon$ for all
$m, n \ge N$.


3.2.4 Theorem (Cauchy’s Criterion) A sequence $(x_n)_n$ of points in $\mathbb{R}^p$ is convergent
if and only if it is a Cauchy sequence.


Proof Clearly, every convergent sequence is also a Cauchy sequence. For the
converse, adapt the argument of Proposition 3.2.3(a) to prove that any Cauchy
sequence $(x_n)_n$ in $\mathbb{R}^p$ is Cauchy coordinatewise, that is, all sequences $(\mathrm{pr}_k(x_n))_n$,
for $k \in \{1, \dots, p\}$, are Cauchy in $\mathbb{R}$. Then, use the Cauchy completeness of $\mathbb{R}$ and
Proposition 3.2.3(b).


3.2.5 Remark $\mathbb{R}^p$ coincides as a linear space with the space $\mathcal{F}(\{1, 2, \dots, p\}, \mathbb{R})$
of all real-valued functions defined on the finite set $\{1, 2, \dots, p\}$. According to
Remark 1.2.9, $\mathbb{R}^p$ is an example of a linear lattice with respect to the ordering by
components,

$$x \le y \quad \text{if and only if} \quad x_k \le y_k \text{ for all } k.$$

This ordering is not total. For example, the vectors of the natural basis are not
comparable to each other. However, it still has some nice features, the Monotone
Convergence Theorem having an analogue in this context. See Exercise 4.


Exercises 


1. If $(x_n)_n$ is a sequence of points in $\mathbb{R}^p$ with limit $x$, show that $\|x_n\| \to \|x\|$. Is the
converse true?

2. Prove that $x_n \to x$ in $\mathbb{R}^p$ if and only if $\langle x_n, y\rangle \to \langle x, y\rangle$ for every $y \in \mathbb{R}^p$.

3. Suppose that $(x_n)_n$ and $(y_n)_n$ are two convergent sequences of points in $\mathbb{R}^p$ with
limits respectively $x$ and $y$. Prove that $\langle x_n, y_n\rangle \to \langle x, y\rangle$.

4. Suppose that $(x_n)_n$ is an increasing sequence of points in $\mathbb{R}^p$ bounded above in
order by a point $y$ (that is, $x_n \le y$ for all $n$). Prove that $(x_n)_n$ is convergent to its
least upper bound in $\mathbb{R}^p$.

5. Prove that every point in $\mathbb{R}^p$ is the limit of a sequence of points all of whose
coordinates are rational numbers.


3.3 The Complex Field 


The 2-dimensional vector space $\mathbb{R}^2$ has the nice feature that its additive structure
$$(a_1, b_1) + (a_2, b_2) = (a_1 + a_2,\ b_1 + b_2)$$
can be completed by the following law of multiplication,
$$(a_1, b_1)(a_2, b_2) = (a_1 a_2 - b_1 b_2,\ a_1 b_2 + a_2 b_1).$$

One can easily prove that the multiplication is associative and commutative, and
the pair (1, 0) plays the role of unit. Moreover, every nonzero element $z = (a, b)$
has an inverse, precisely,
$$\frac{1}{z} = \left(\frac{a}{a^2 + b^2},\ \frac{-b}{a^2 + b^2}\right).$$

The multiplication is also distributive with respect to addition, that is,

$$z(u + v) = zu + zv.$$
Therefore, under the above operations of addition and multiplication, the set $\mathbb{R}^2$
is a field, called the complex field. It is usually denoted $\mathbb{C}$ and its elements are called
complex numbers.

Fig. 3.2 Representation of complex numbers (axes: Re z and Im z)

The function
$$g : \mathbb{R} \to \mathbb{C}, \quad g(a) = (a, 0),$$
is an injective morphism of fields, and thus an isomorphism between $\mathbb{R}$ and $g(\mathbb{R})$.
This allows the identification of $\mathbb{R}$ with $g(\mathbb{R})$ (in particular, of each real number $a$
with the complex number $(a, 0)$). In this respect, the complex field is an extension of
the real field.
Denoting the vectors of the natural basis of $\mathbb{R}^2$ by

$$1 = (1, 0) \quad \text{and} \quad i = (0, 1),$$

we obtain for the complex numbers $z = (a, b)$ the representation

$$z = (a, b) = a(1, 0) + b(0, 1) = a + bi. \tag{3.2}$$

In this setting, the components $a$ and $b$ of $z$ are called respectively the real part and
the imaginary part of $z$. Usually they are denoted

$$a = \mathrm{Re}\, z \quad \text{and} \quad b = \mathrm{Im}\, z.$$

Given a complex number $z = a + bi$, the point $(a, b)$ in the real plane that
represents it is called the geometrical image of $z$, while in turn, $z$ is the affix of the
point $(a, b)$. See Fig. 3.2.

According to the definition of multiplication of two complex numbers, we get

$$i^2 = (0, 1) \cdot (0, 1) = (-1, 0) = -1,$$

which shows that in $\mathbb{C}$ we can compute the square root of a negative number. Since
the polynomial $z^2 + 1$ has the factorization

$$z^2 + 1 = (z + i)(z - i),$$

it follows that the equation $z^2 + 1 = 0$ has the roots $i$ and $-i$ (and only these two).
In Sect. 6.10 we will show that every nonconstant polynomial with complex
coefficients can be factorized as a product of factors of degree one (and thus the
number of its roots in $\mathbb{C}$, counted with their multiplicities, equals the degree of the
polynomial). This important result is known as the Fundamental Theorem of Algebra
(or the D’Alembert-Gauss Theorem).

Using the notation $z = a + bi$ for complex numbers, the operations in $\mathbb{C}$ can be
reformulated in the more familiar way,

$$(a + bi) + (c + di) = a + c + (b + d)i$$
$$(a + bi)(c + di) = ac - bd + (ad + bc)i.$$

By definition, the conjugate of a complex number $z = a + bi$ is the complex
number
$$\bar{z} = a - bi,$$

and the absolute value (or the modulus) of $z = a + bi$ is the nonnegative number
$$|z| = \sqrt{a^2 + b^2}.$$

Notice that the points of affixes $z$ and $\bar{z}$ are symmetric with respect to the Ox axis
and
$$z \bar{z} = |z|^2.$$

Many of the results concerning $\mathbb{R}$ extend to $\mathbb{C}$. A notable exception is the ordering:
there does not exist on $\mathbb{C}$ any order relation $\le$ that verifies all Axioms 1.2.4-1.2.6 and
induces on $\mathbb{R}$ the natural ordering. In fact, as we already noticed in Sect. 1.2, the Total
Ordering Axioms 1.2.6 and 1.2.5 (AS/OS2) imply that squares are nonnegative. However,

$$i^2 = -1 < 0.$$

Due to this fact, the essence of the absolute value function on $\mathbb{C}$ is different from
that of the absolute value function on $\mathbb{R}$ (that was introduced using the property of total
ordering of $\mathbb{R}$). The true link between them is provided by the concepts of norm
and metric, already discussed in Sects. 3.1 and 3.2. That gives to $|z|$ the meaning of
distance from the origin to the point whose affix is $z$.

The functions with values in $\mathbb{C}$ (respectively the functions defined on suitable
subsets of $\mathbb{C}$) are called complex functions (respectively functions of a complex
variable). Their study lies at the core of Analysis.

There is a strong connection between the real functions and the complex functions.
Precisely, given a function $f : A \to \mathbb{C}$, we can attach to it several real valued
functions such as the real part, the imaginary part, and the absolute value of $f$,
given, respectively, by

$$(\mathrm{Re}\, f)(x) = \mathrm{Re}\, f(x), \quad (\mathrm{Im}\, f)(x) = \mathrm{Im}\, f(x), \quad |f|(x) = |f(x)|.$$

In the same way, we can define the conjugate of $f$ by the formula

$$\bar{f}(x) = \overline{f(x)}.$$

These functions are related by relations similar to those already outlined in the case of
numbers. Particularly,

$$f = \mathrm{Re}\, f + i\, \mathrm{Im}\, f.$$
Exercises 


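1. Prove that $\big||z| - |w|\big| \le |z + w|$ for every $z, w \in \mathbb{C}$.
2. If $z_1, \dots, z_n$ are complex numbers, prove that

$$|z_1 + \cdots + z_n| \le |z_1| + \cdots + |z_n|.$$

3. Check the following properties of the conjugation function:
(a) $\overline{(\bar{z})} = z$, and $z = \bar{z}$ if and only if $z$ is a real number.
(b) For every $z \in \mathbb{C}$,

$$\mathrm{Re}\, z = \frac{z + \bar{z}}{2}, \quad \mathrm{Im}\, z = \frac{z - \bar{z}}{2i}$$

and
$$|\mathrm{Re}\, z| \le |z| \quad \text{(with equality only for } \mathrm{Im}\, z = 0\text{)}$$
$$|\mathrm{Im}\, z| \le |z| \quad \text{(with equality only for } \mathrm{Re}\, z = 0\text{)}.$$

(c) $\overline{z_1 \pm z_2} = \bar{z}_1 \pm \bar{z}_2$, $\overline{z_1 z_2} = \bar{z}_1 \bar{z}_2$ and $\overline{\left(\frac{u}{v}\right)} = \frac{\bar{u}}{\bar{v}}$ for all $z_1, z_2, u, v \in \mathbb{C}$, $v \neq 0$.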


4. Let $z_1, z_2, \dots, z_n$ be complex numbers whose absolute value is 1. Prove that the
number u = a eee is real. 
a4 En 


[Hint: Prove that $u = \bar{u}$, by noticing that $\bar{z} = 1/z$ if $|z| = 1$.]
5. (The Schur product). Consider on $\mathbb{R}^2$ the following operations:

$$(a, b) + (c, d) = (a + c,\ b + d) \quad \text{and} \quad (a, b) * (c, d) = (ac,\ bd).$$

Prove that $(\mathbb{R}^2, +, *)$ is a commutative ring with unit but it is not a field.
6. (The matrix representation of $\mathbb{C}$). Let $M$ be the set of all matrices of the form

$$\begin{pmatrix} a & b \\ -b & a \end{pmatrix}$$

with $a, b \in \mathbb{R}$. We endow $M$ with the usual operations of addition and
multiplication of matrices. Prove that $M$ is a field and the function $g : \mathbb{C} \to M$
given by

$$g(a + bi) = \begin{pmatrix} a & b \\ -b & a \end{pmatrix}$$

is an isomorphism of fields.


3.4 Convergence and Completeness in C 


In the previous section, we introduced the absolute value of a complex number
$z = a + bi$ by the formula

$$|z| = \sqrt{a^2 + b^2}.$$

This coincides with the Euclidean norm of $z$ viewed as an element of $\mathbb{R}^2$. Therefore,
the absolute value behaves exactly like the Euclidean norm and verifies the same set
of properties:

(NORM1) $|z| \ge 0$
(NORM2) $|z| = 0$ if and only if $z = 0$
(NORM3) $|z_1 z_2| = |z_1| \cdot |z_2|$
(NORM4) $|z_1 + z_2| \le |z_1| + |z_2|$

for all $z, z_1, z_2 \in \mathbb{C}$.

The notions of bounded set, convergent sequence, limit, and Cauchy sequence in
$\mathbb{C}$ coincide with their relatives in $\mathbb{R}^2$. For the convenience of the reader, we briefly
describe them.

A subset $A$ of $\mathbb{C}$ is bounded if the absolute values of its elements are bounded
above. That is, if there exists a number $M > 0$ such that for all $z \in A$ we have

$$|z| \le M.$$


Since

$$\max\{|\mathrm{Re}\, z|, |\mathrm{Im}\, z|\} \le |z| = \sqrt{|\mathrm{Re}\, z|^2 + |\mathrm{Im}\, z|^2} \le \sqrt{2} \max\{|\mathrm{Re}\, z|, |\mathrm{Im}\, z|\}$$

for every complex number $z$, a subset $A$ of $\mathbb{C}$ is bounded if and only if the real parts
and the imaginary parts of its elements form a bounded subset of $\mathbb{R}$.

A numerical sequence $(z_n)_n$ converges to a number $z$ (abbreviated, $z_n \to z$)
if for every $\varepsilon > 0$, there is an index $N \in \mathbb{N}$ such that for all $n \ge N$, we have
$|z_n - z| < \varepsilon$. The number $z$ is unique (with the above property) and is called the
limit of the sequence $(z_n)_n$. We put

$$z = \lim_{n \to \infty} z_n.$$




As in the case of sequences of points in $\mathbb{R}^p$, the change of finitely many terms in
a sequence affects neither its convergence nor its limit. This remark is useful when
dealing with limits of inverses of complex numbers. See Exercise 6.

Noticing that $z_n \to 0$ in $\mathbb{C}$ is equivalent to $|z_n| \to 0$ in $\mathbb{R}$, one can easily show
that

$$\lim_{n \to \infty} z^n = 0 \quad \text{for every } z \text{ with } |z| < 1.$$

See Example 2.2.2(b) for the case where $z$ is real. The behavior of the powers of
$z \in \mathbb{C}$ when $|z| = 1$ differs much from the real case and is presented in Appendix A.
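The decay of the powers $z^n$ for $|z| < 1$ can be watched numerically. The Python sketch below lists the moduli $|z^n|$ using the built-in `complex` type; the function name `power_moduli` and the sample value $z = 0.6 + 0.7i$ (of modulus $\sqrt{0.85} < 1$) are assumptions made for this illustration.

```python
def power_moduli(z, n_max):
    """Return the list [|z^0|, |z^1|, ..., |z^n_max|]."""
    moduli = []
    w = 1 + 0j  # current power z^n, starting with z^0 = 1
    for _ in range(n_max + 1):
        moduli.append(abs(w))
        w *= z
    return moduli


inside = power_moduli(0.6 + 0.7j, 200)   # |z| < 1: the moduli shrink to 0
on_circle = power_moduli(1j, 8)          # |z| = 1: the moduli stay equal to 1
print(inside[200], on_circle)
```

Since $|z^n| = |z|^n$, the moduli form a real geometric progression, which reduces the complex statement to the real case.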
In $\mathbb{C}$, convergence is equivalent to coordinatewise convergence. This fact, which
is a particular case of Proposition 3.2.3, can be detailed as follows:


3.4.1 Proposition (a) If $z_n \to z$ in $\mathbb{C}$, then $\mathrm{Re}\, z_n \to \mathrm{Re}\, z$ and $\mathrm{Im}\, z_n \to \mathrm{Im}\, z$
in $\mathbb{R}$.

(b) Conversely, if $\mathrm{Re}\, z_n \to a$ and $\mathrm{Im}\, z_n \to b$ in $\mathbb{R}$, then $z_n \to z$, where
$z = a + bi$.


Cauchy’s Criterion of convergence in $\mathbb{C}$ is a particular case of Theorem 3.2.4.


3.4.2 Theorem (Cauchy’s Criterion of Convergence in $\mathbb{C}$) A sequence $(z_n)_n$ of
complex numbers is convergent if and only if it is a Cauchy sequence, that is, for every
$\varepsilon > 0$, there is an index $N$ such that for all $m, n \ge N$,

$$|z_m - z_n| < \varepsilon.$$


The Bolzano-Weierstrass Theorem also works in C. 


3.4.3 Theorem (The Bolzano-Weierstrass Theorem for sequences of complex num- 
bers) Every bounded sequence of complex numbers contains a convergent subse- 
quence. 


Proof Let $(z_n)_n$ be a bounded sequence of complex numbers. According to Lemma
3.2.1, both sequences $(\mathrm{Re}\, z_n)_n$ and $(\mathrm{Im}\, z_n)_n$ are bounded. Then by the Bolzano-
Weierstrass Theorem (for real sequences), there is a subsequence $(\mathrm{Re}\, z_{k(n)})_n$ which
is convergent, say to $a$. The same argument applied to $(\mathrm{Im}\, z_{k(n)})_n$ yields a
subsequence $(\mathrm{Im}\, z_{l(k(n))})_n$ which converges to a number $b$. Finally, Proposition 3.4.1
shows that $z_{l(k(n))} \to z = a + bi$.


Having in hand the analogue of the Bolzano-Weierstrass Theorem for sequences of
complex numbers, one can derive Cauchy’s Criterion of Convergence in $\mathbb{C}$ as in
Sect. 2.4. See Exercise 4. It is worth noticing that this theorem can be easily extended
(with a similar argument) to the case of Euclidean spaces $\mathbb{R}^p$. See Exercise 10.


Exercises 


1. Prove that every convergent sequence in $\mathbb{C}$ is also a Cauchy sequence and every
Cauchy sequence is bounded.

2. Suppose that $z_n \to z$ in $\mathbb{C}$. Prove that every subsequence of $(z_n)_n$ is also
convergent to $z$.

3. Prove that every Cauchy sequence of complex numbers that contains a convergent
subsequence is itself convergent.

4. Deduce Cauchy’s Criterion of convergence in $\mathbb{C}$ from the complex version of
the Bolzano-Weierstrass Theorem and the results of the preceding exercises.

5. Suppose that $z_n \to z$ and $w_n \to w$ in $\mathbb{C}$. Prove that:
(a) $z_n + w_n \to z + w$;
(b) $z_n w_n \to zw$.

6. Suppose that $z_n \to z$ in $\mathbb{C}$ and $z \neq 0$. Prove that $z_n \neq 0$ for all but finitely many
indices $n$, and $\frac{1}{z_n} \to \frac{1}{z}$.

7. The signum function is defined on $\mathbb{C}$ by the formula

$$\mathrm{sgn}(z) = \begin{cases} \dfrac{z}{|z|} & \text{if } z \neq 0 \\ 0 & \text{if } z = 0. \end{cases}$$

Clearly, $z = |z|\, \mathrm{sgn}(z)$ and $\mathrm{sgn}(z_1 z_2) = \mathrm{sgn}(z_1)\, \mathrm{sgn}(z_2)$ for all $z, z_1, z_2 \in \mathbb{C}$. In
Sect. 7.4 we will present the trigonometric expression of the signum function.
(a) Prove that if $z_n \to z$ and $z \neq 0$, then $\mathrm{sgn}(z_n) \to \mathrm{sgn}(z)$.
(b) Formulate (and prove) a converse of the preceding implication.

8. The nonzero values of the signum function belong to the unit circle,

$$S^1 = \{z \in \mathbb{C} : |z| = 1\}.$$

Prove that $S^1$ is a commutative group with respect to complex multiplication.

9. Suppose that $(z_n)_n$ is a bounded sequence of complex numbers and $(w_n)_n$ is
a sequence of complex numbers convergent to 0. Prove that $(z_n w_n)_n$ is also
convergent to 0.

10. (The Bolzano-Weierstrass Theorem for sequences of points in $\mathbb{R}^p$). Prove that
every bounded sequence of points in $\mathbb{R}^p$ contains a convergent subsequence.
[Hint: Adapt the argument of Theorem 3.4.3.]


3.5 Normed Linear Spaces 


The Euclidean spaces $\mathbb{R}^p$ and $\mathbb{C}$ are the simplest examples of normed linear spaces.


3.5.1 Definition A normed linear space is a linear space $E$ (over the field $\mathbb{K} = \mathbb{R}$
or $\mathbb{C}$) on which there is given a function $\|\cdot\| : E \to \mathbb{R}$ (called a norm), which
verifies the following four properties:

(NORM1) $\|x\| \ge 0$
(NORM2) $\|x\| = 0$ if and only if $x = 0$
(NORM3) $\|\alpha x\| = |\alpha| \cdot \|x\|$
(NORM4) $\|x + y\| \le \|x\| + \|y\|$

for every $x, y \in E$ and every $\alpha \in \mathbb{K}$.

A normed linear space should be understood as a pair $(E, \|\cdot\|)$, since in general
several norms can be defined on the same linear space. See Exercise 1. In the
context of normed linear spaces, the notation $\mathbb{R}^p$ (or $\mathbb{C}$) will mean that the norm
under consideration is the Euclidean norm.
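To see concretely that one linear space can carry several norms, the Python sketch below evaluates three norms on the same vector of $\mathbb{R}^2$: the Euclidean norm and the two norms $\|\cdot\|_1$ and $\|\cdot\|_\infty$ introduced in Exercise 1 at the end of this section. The function names `norm_1`, `norm_2` and `norm_inf` are assumptions made for this illustration; the code only evaluates the formulas and does not verify the norm axioms.

```python
import math


def norm_1(x):
    """Sum norm: |x_1| + ... + |x_p|."""
    return sum(abs(t) for t in x)


def norm_2(x):
    """Euclidean norm: sqrt(x_1^2 + ... + x_p^2)."""
    return math.sqrt(sum(t * t for t in x))


def norm_inf(x):
    """Max norm: the largest of |x_1|, ..., |x_p|."""
    return max(abs(t) for t in x)


x = [3.0, -4.0]
print(norm_1(x), norm_2(x), norm_inf(x))  # 7.0 5.0 4.0
```

The three values differ, so the pairs $(\mathbb{R}^2, \|\cdot\|_1)$, $(\mathbb{R}^2, \|\cdot\|_2)$ and $(\mathbb{R}^2, \|\cdot\|_\infty)$ are different normed linear spaces built on the same linear space.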

A normed linear space is said to be real, respectively complex, according to its 
field of scalars, R or C. 

$\mathbb{R}^p$ is a real normed linear space. $\mathbb{C}$ can be viewed as a normed linear space both
over $\mathbb{R}$ and over $\mathbb{C}$. In the first case, $\mathbb{C}$ is a 2-dimensional real linear space, while in the
second case it is a one-dimensional complex linear space.

The notions of bounded set, convergent sequence, limit, and Cauchy sequence 
can be introduced mutatis mutandis following the model of Euclidean spaces. Most 
general results (such as the boundedness of a convergent sequence, the convergence of 
subsequences of a convergent sequence, and the algebraic operations with convergent 
sequences) also work in the setting of normed linear spaces. However, the Bolzano- 
Weierstrass Theorem and Cauchy’s Criterion of Convergence may fail in certain 
normed linear spaces. See Exercise 5. 

The normed linear spaces in which every Cauchy sequence is convergent are said
to be complete, or Banach spaces. The discussion above shows that the Euclidean
spaces $\mathbb{R}^p$ and $\mathbb{C}$ are Banach spaces. Other important examples of Banach spaces
are mentioned in Exercises 3 and 4.

The functions acting on normed linear spaces are usually called operators, while
the functions defined on normed linear spaces and taking values in $\mathbb{R}$ or $\mathbb{C}$ are
called functionals. The coordinate projections are the simplest examples of (linear)
functionals on $\mathbb{R}^p$. A more advanced example is provided by the operation of taking
the limit,

$$\lim : c(\mathbb{N}, \mathbb{R}) \to \mathbb{R}.$$

In Chap. 8 we will discuss the functional of differentiation at a point, while
Chaps. 9-11 are devoted to various functionals of integration. Linear algebra
emphasizes the role of linear operators from $\mathbb{R}^p$ into itself. They constitute a
$p \times p$-dimensional space denoted $L(\mathbb{R}^p, \mathbb{R}^p)$. Other important examples are the
operator of differentiation and the Fourier transform.


Exercises 


1. Verify that the functions

$$\|x\|_1 = \sum_{k=1}^{p} |x_k| \quad \text{and} \quad \|x\|_\infty = \max_{1 \le k \le p} |x_k|$$

define norms on $\mathbb{R}^p$ (different from the Euclidean norm when $p > 1$).
2. Prove that the following four linear spaces of sequences of real numbers,
$\mathcal{F}_0(\mathbb{N}, \mathbb{R})$, $c_0(\mathbb{N}, \mathbb{R})$, $c(\mathbb{N}, \mathbb{R})$, and $\ell^\infty_{\mathbb{R}}(\mathbb{N})$
become normed linear spaces when endowed with the sup norm,

$$\|(a_n)_n\|_\infty = \sup_{n \in \mathbb{N}} |a_n|.$$

3. Prove that the normed linear spaces $c_0(\mathbb{N}, \mathbb{R})$, $c(\mathbb{N}, \mathbb{R})$, and $\ell^\infty_{\mathbb{R}}(\mathbb{N})$ mentioned
in the preceding exercise are Banach spaces, but $\mathcal{F}_0(\mathbb{N}, \mathbb{R})$ is not complete.

4. Consider the complex variants of the spaces $c_0(\mathbb{N}, \mathbb{R})$, $c(\mathbb{N}, \mathbb{R})$, and $\ell^\infty_{\mathbb{R}}(\mathbb{N})$,
respectively the spaces of complex sequences $c_0(\mathbb{N})$, $c(\mathbb{N})$, and $\ell^\infty(\mathbb{N})$. Prove that they
are Banach spaces with respect to the sup norm.

5. Prove that the Bolzano-Weierstrass Theorem fails in $\ell^\infty(\mathbb{N})$.


3.6 Notes and Remarks 


Our presentation of complex numbers as ordered pairs of real numbers follows the
ideas of William Rowan Hamilton (1837).

The complex numbers appeared in connection with the formula for the roots of
cubic polynomials $x^3 + px + q$. This was published by Gerolamo Cardano in his
treatise Ars Magna (1545).

The interpretation of complex numbers as points in the real plane is due
independently to Caspar Wessel (1799), Jean-Robert Argand (1806) and Carl Friedrich
Gauss (1831). The term “complex number” is also due to Gauss (1831). In 1797
Gauss gave the first substantial proof of the Fundamental Theorem of Algebra.

$\mathbb{R}$ and $\mathbb{R}^2$ are not the only finite dimensional linear spaces that admit a field
structure. In fact, William Rowan Hamilton discovered in 1843 that $\mathbb{R}^4$ can
be turned into a noncommutative field $\mathbb{H}$, called the field of quaternions, via the
formulas

$$(a_1, b_1, c_1, d_1) + (a_2, b_2, c_2, d_2) = (a_1 + a_2,\ b_1 + b_2,\ c_1 + c_2,\ d_1 + d_2)$$

$$\begin{aligned}
(a_1, b_1, c_1, d_1)(a_2, b_2, c_2, d_2) = (\,& a_1 a_2 - b_1 b_2 - c_1 c_2 - d_1 d_2, \\
& a_1 b_2 + b_1 a_2 + c_1 d_2 - d_1 c_2, \\
& a_1 c_2 - b_1 d_2 + c_1 a_2 + d_1 b_2, \\
& a_1 d_2 + b_1 c_2 - c_1 b_2 + d_1 a_2\,).
\end{aligned}$$


The real linear space $\mathbb{R}^4$ has the linear basis consisting of the vectors

$$1 = (1, 0, 0, 0), \quad i = (0, 1, 0, 0), \quad j = (0, 0, 1, 0), \quad k = (0, 0, 0, 1).$$

Then $i^2 = j^2 = k^2 = ijk = -1$ and

$$ij = k, \quad jk = i, \quad ki = j,$$
$$ji = -k, \quad kj = -i, \quad ik = -j.$$
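The multiplication table of the basis can be checked mechanically from the product formula. The Python sketch below implements Hamilton’s product on 4-tuples; the names `qmul`, `ONE`, `I`, `J`, `K` and `neg` are assumptions made for this illustration, and exact rational arithmetic from the standard `fractions` module is used for the last check.

```python
from fractions import Fraction


def qmul(p, q):
    """Hamilton product of two quaternions given as 4-tuples (a, b, c, d)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def neg(q):
    """Additive inverse of a quaternion."""
    return tuple(-t for t in q)


ONE = (1, 0, 0, 0)
I = (0, 1, 0, 0)
J = (0, 0, 1, 0)
K = (0, 0, 0, 1)

print(qmul(I, J) == K, qmul(J, I) == neg(K))  # True True: not commutative

# A purely imaginary quaternion with b^2 + c^2 + d^2 = 1 squares to -1.
z = (0, Fraction(3, 5), Fraction(4, 5), 0)
print(qmul(z, z) == neg(ONE))  # True
```

The last line illustrates how an equation of degree two can have more than two quaternion solutions: every point of the unit sphere $b^2 + c^2 + d^2 = 1$ yields one.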


A polynomial over the field of quaternions can have more distinct solutions than
the degree of the polynomial. For example, the equation $z^2 + 1 = 0$ has infinitely
many quaternion solutions, more precisely, all quaternions $z = bi + cj + dk$ with
$b, c, d \in \mathbb{R}$ and $b^2 + c^2 + d^2 = 1$. The field of quaternions is isomorphic to the field
of matrices

$$\left\{ \begin{pmatrix} a & b & c & d \\ -b & a & -d & c \\ -c & d & a & -b \\ -d & -c & b & a \end{pmatrix} : a, b, c, d \in \mathbb{R} \right\},$$

from which it follows that the quaternions can be viewed as pairs of complex numbers.
The book of Nahin [1] offers a nice account of all these facts and much more.


Reference 


1. Nahin, P.J.: An Imaginary Tale: The Story of $\sqrt{-1}$. Princeton University Press, Princeton (1998)


Chapter 4 
Numerical Series 


A suggestive definition (though not entirely accurate) for the notion of series is that
of a sum with infinitely many terms. As a matter of fact, we already touched the
subject of series on several occasions: the decimal representation of real numbers
and the representation of $e$ as $e = \lim_{n \to \infty} \sum_{k=0}^{n} \frac{1}{k!}$. The goal of this chapter is to start
a systematic study of series, noticing that the main constants in analysis as well as
many important functions are all defined as sums of suitable series.


4.1 Convergent Series 


The notion of series was introduced to give a sense to the sums with infinitely many 
terms. 

A numerical series (indexed over $\mathbb{N}$) is meant as a pair consisting of a sequence
$(a_n)_n$ of complex numbers (called the sequence of terms) and a method of summation
of it. The method of summation is the rule for computing the so-called partial sums.

The principal method of summation that will be considered in our book is that of
summation term by term, in which case the sequence $(S_n)_n$ of partial sums is defined
by the formula

$$S_n = \sum_{k=0}^{n} a_k = a_0 + \cdots + a_n, \quad n \in \mathbb{N}.$$

In turn, the terms of the series are determined by the partial sums:

$$a_0 = S_0,$$
$$a_n = S_n - S_{n-1} \quad \text{for } n \ge 1.$$

Usually, a series is denoted by one of the symbols

$$\sum_{n \ge 0} a_n \quad \text{or} \quad a_0 + a_1 + a_2 + \cdots$$

Sometimes, we are led to consider a series indexed over other sets of integers such
as

$$\{n_0,\ n_0 + 1,\ n_0 + 2,\ \dots\}.$$

Their theory is similar, and therefore we shall omit the details.


4.1.1 Definition We say that a series $\sum_{n \ge 0} a_n$ is convergent (and its sum is $S$) if
the sequence $(S_n)_n$ of its partial sums is convergent to $S$.


Usually, the sum is denoted by $\sum_{n=0}^{\infty} a_n$. In this way,

$$\sum_{n=0}^{\infty} a_n = \lim_{n \to \infty} \sum_{k=0}^{n} a_k.$$

A series that is not convergent is said to be divergent. In the case of series of real
numbers, we can encounter the case where the sequence of partial sums has the limit
$-\infty$ or $\infty$; such series are divergent, with sum $-\infty$, respectively $\infty$.

The main problem in connection with a series is the computation of its sum
(provided it is convergent).


4.1.2 Lemma If a series $\sum_{n \ge 0} a_n$ is convergent, then the sequence of its terms
converges to zero.


Proof In fact, $a_n = S_n - S_{n-1} \to S - S = 0$.


The converse is not true. See the case of the harmonic series, which is the subject
of Example 4.1.3(d).


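4.1.3 Example (a) The series

$$\sum_{n \ge 0} \frac{1}{(n+1)(n+2)} = \frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \frac{1}{3 \cdot 4} + \cdots$$

is convergent and its sum is 1. In fact, using the decomposition

$$\frac{1}{(n+1)(n+2)} = \frac{1}{n+1} - \frac{1}{n+2},$$

we obtain

$$\sum_{k=0}^{n} \frac{1}{(k+1)(k+2)} = \left(1 - \frac{1}{2}\right) + \left(\frac{1}{2} - \frac{1}{3}\right) + \cdots + \left(\frac{1}{n+1} - \frac{1}{n+2}\right) = 1 - \frac{1}{n+2} \to 1.$$

A generalization of this example is the object of Exercise 1.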
(b) The geometric series of ratio $z \in \mathbb{C}$ is the series

$$\sum_{n \ge 0} z^n = 1 + z + z^2 + \cdots;$$

the name comes from the fact that the sequence of its terms, $1, z, z^2, \dots$, is a
geometric progression.
For $|z| \ge 1$, the sequence of terms does not converge to 0, so, according to
Lemma 4.1.2, the series is divergent. If $|z| < 1$, then

$$S_n = 1 + z + \cdots + z^n = \frac{1 - z^{n+1}}{1 - z} \to \frac{1}{1 - z},$$

and thus, in this case, the series is convergent and has the sum $1/(1 - z)$.
(c) As follows from Sect. 2.3, the number $e$ can be described as the sum of a
convergent series:

$$e = \sum_{n=0}^{\infty} \frac{1}{n!}.$$

(d) The harmonic series is defined by

$$\sum_{n \ge 1} \frac{1}{n} = 1 + \frac{1}{2} + \frac{1}{3} + \cdots$$

Its name comes from the fact that every term (except the first one) is the harmonic
mean of the two nearby terms. Though the sequence of its terms converges to 0, this
series is divergent. In fact, the sequence $(S_n)_n$ of its partial sums verifies

$$S_{2n} - S_n = \frac{1}{n+1} + \frac{1}{n+2} + \cdots + \frac{1}{2n} \ge \frac{n}{2n} = \frac{1}{2},$$

and thus it cannot be a Cauchy sequence. However, the divergence of the harmonic
series is slow. See Exercise 7 at the end of Sect. 4.2.
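The partial sums appearing in parts (a), (b) and (d) of Example 4.1.3 can be computed directly. The Python sketch below does so with a small helper; the name `partial_sums`, the cut-off `N = 50` and the sample ratio $z = 0.5 + 0.5i$ are assumptions made for this illustration. Exact rational arithmetic (`fractions.Fraction`) is used where the closed forms are rational.

```python
from fractions import Fraction


def partial_sums(terms):
    """Return [S_0, S_1, ...], where S_n = terms[0] + ... + terms[n]."""
    sums, total = [], 0
    for t in terms:
        total += t
        sums.append(total)
    return sums


N = 50

# (a) Telescoping series: S_n = 1 - 1/(n + 2), exactly.
tele = partial_sums([Fraction(1, (n + 1) * (n + 2)) for n in range(N)])

# (b) Geometric series of ratio z with |z| < 1: S_n approaches 1/(1 - z).
z = 0.5 + 0.5j
geo = partial_sums([z ** n for n in range(N)])

# (d) Harmonic series: harm[n - 1] is S_n = 1 + 1/2 + ... + 1/n.
harm = partial_sums([Fraction(1, n) for n in range(1, 2 * N + 1)])

print(tele[N - 1] == 1 - Fraction(1, N + 1))
print(abs(geo[N - 1] - 1 / (1 - z)))
print(all(harm[2 * n - 1] - harm[n - 1] >= Fraction(1, 2) for n in range(1, N + 1)))
```

The last line checks the gap $S_{2n} - S_n \ge 1/2$ that prevents the partial sums of the harmonic series from forming a Cauchy sequence.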
In what follows, the nature of a series is understood as the property
of being convergent or divergent. The next result analyzes the outcome of a finite
number of changes.


4.1.4 Lemma (a) If in a series we permute finitely many terms, its nature remains
the same (as well as its sum, if the series is convergent).

(b) If in a series we modify finitely many terms, its nature remains the same (but 
the sum may change). 


The permutation of infinitely many terms may change not only the sum, but also 
the nature of a series. Fortunately, this does not apply to series with positive terms. 
See Sects. 4.3 and 4.4 for details. 

A sequence of complex numbers is convergent if and only if it is a Cauchy 
sequence. Accordingly, a numerical series is convergent if and only if the sequence 
of its partial sums is a Cauchy sequence. This fact can be reformulated as follows: 


4.1.5 Theorem (Cauchy’s Criterion for Series) A series $\sum_{n \ge 0} z_n$ is convergent if
and only if for any $\varepsilon > 0$, there is an index $N$ such that for all $n \ge N$ and all $p \ge 0$,
we have

$$\left| \sum_{k=n}^{n+p} z_k \right| < \varepsilon.$$

Cauchy’s Criterion has important consequences. For example, in many cases it
reduces the problem of convergence of a series to the similar problem for a series of
positive real numbers. The key notion in this respect is that of absolutely convergent
series.

A series is called absolutely convergent if the series of the absolute values of its 
terms is convergent. 

The following result represents a generalization of the absolute value inequality. 


4.1.6 Theorem Every absolutely convergent series $\sum_{n \ge 0} z_n$ is convergent and

$$\left| \sum_{n=0}^{\infty} z_n \right| \le \sum_{n=0}^{\infty} |z_n|.$$

Proof Since the given series is absolutely convergent, by Cauchy’s Criterion, for any
$\varepsilon > 0$, there is $N \in \mathbb{N}$ such that

$$\sum_{k=n}^{n+p} |z_k| < \varepsilon$$

for all $n \ge N$ and all $p \ge 0$. Then, by the absolute value inequality, we get

$$\left| \sum_{k=n}^{n+p} z_k \right| < \varepsilon$$

for all $n \ge N$ and all $p \ge 0$. Using again Cauchy’s Criterion, we obtain the
convergence of the series $\sum_{n \ge 0} z_n$. Finally,

$$\left| \sum_{n=0}^{\infty} z_n \right| = \left| \lim_{n \to \infty} \sum_{k=0}^{n} z_k \right| = \lim_{n \to \infty} \left| \sum_{k=0}^{n} z_k \right| \le \lim_{n \to \infty} \sum_{k=0}^{n} |z_k| = \sum_{k=0}^{\infty} |z_k|.$$

There exist convergent series that are not absolutely convergent. Such series are
called conditionally convergent. An example is $\sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n}$. See Exercise 2(b) at
the end of Sect. 2.3, or Corollary 4.3.2.

The sum of two series $\sum_{n \ge 0} u_n$ and $\sum_{n \ge 0} v_n$ is the series whose general term is
$u_n + v_n$. If both series are convergent, then the sum series is convergent and

$$\sum_{n=0}^{\infty} (u_n + v_n) = \sum_{n=0}^{\infty} u_n + \sum_{n=0}^{\infty} v_n.$$

The multiplication of a series by a number $\alpha$ is defined by

$$\alpha \cdot \sum_{n \ge 0} u_n = \sum_{n \ge 0} \alpha u_n.$$

If the series $\sum_{n \ge 0} u_n$ is convergent, then so is the series $\alpha \cdot \sum_{n \ge 0} u_n$, and

$$\sum_{n=0}^{\infty} \alpha u_n = \alpha \cdot \sum_{n=0}^{\infty} u_n.$$

The product of two series $\sum_{n \ge 0} u_n$ and $\sum_{n \ge 0} v_n$ is the series $\sum_{n \ge 0} w_n$, where

$$w_n = \sum_{k=0}^{n} u_k v_{n-k}.$$

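4.1.7 Theorem (F. Mertens) If the series $\sum_{n \ge 0} u_n$ and $\sum_{n \ge 0} v_n$ are convergent
and at least one is absolutely convergent, then their product series is convergent and

$$\sum_{n=0}^{\infty} w_n = \left( \sum_{n=0}^{\infty} u_n \right) \left( \sum_{n=0}^{\infty} v_n \right).$$

Proof Let $U_n = \sum_{k=0}^{n} u_k$, $V_n = \sum_{k=0}^{n} v_k$ and $W_n = \sum_{k=0}^{n} w_k$. We have to prove
that $U_n V_n - W_n \to 0$. We assume that the series $\sum_{n \ge 0} v_n$ is absolutely convergent
and put $S = \sum_{k=0}^{\infty} |v_k|$. Since the series $\sum_{n \ge 0} u_n$ is convergent, its partial sums
are bounded, that is,

$$M = \sup_{n \in \mathbb{N}} |U_n| < \infty.$$

Let $\varepsilon > 0$. Since $\sum_{n \ge 0} u_n$ is convergent and $\sum_{n \ge 0} v_n$ is absolutely convergent,
by Cauchy’s Criterion, there is an index $N$ such that

$$\left| \sum_{k=m}^{n} u_k \right| < \frac{\varepsilon}{2(S+1)} \quad \text{and} \quad \sum_{k=m}^{n} |v_k| < \frac{\varepsilon}{4(M+1)}$$

for all $n \ge m \ge N$. Then, for $n \ge 2N$, we get

$$\begin{aligned}
|U_n V_n - W_n| &= |(u_1 + \cdots + u_n) v_n + (u_2 + \cdots + u_n) v_{n-1} + \cdots + u_n v_1| \\
&\le |u_1 + \cdots + u_n| \cdot |v_n| + |u_2 + \cdots + u_n| \cdot |v_{n-1}| + \cdots + |u_n| \cdot |v_1| \\
&= |U_n - U_0| \cdot |v_n| + |U_n - U_1| \cdot |v_{n-1}| + \cdots + |U_n - U_{n-N}| \cdot |v_N| \\
&\quad + \left| \sum_{k=n-N+2}^{n} u_k \right| \cdot |v_{N-1}| + \cdots + \left| \sum_{k=n}^{n} u_k \right| \cdot |v_1| \\
&\le 2M \cdot \sum_{k=N}^{n} |v_k| + \frac{\varepsilon}{2(S+1)} \sum_{k=1}^{N-1} |v_k| \\
&\le 2M \cdot \frac{\varepsilon}{4(M+1)} + \frac{\varepsilon}{2(S+1)} \cdot S < \varepsilon,
\end{aligned}$$

which concludes the proof.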


The hypothesis that at least one of the two series is absolutely convergent is 
fundamental in Theorem 4.1.7. See Exercise 5. 
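The product series can be formed mechanically from the formula $w_n = \sum_{k=0}^{n} u_k v_{n-k}$. The Python sketch below does this for finite lists of terms and applies it to the geometric series of ratio $1/2$ multiplied by itself. By Theorem 4.1.7 the product series sums to $2 \cdot 2 = 4$, and its terms are $w_n = (n+1)/2^n$. The function name `cauchy_product` and the truncation at 60 terms are assumptions made for this illustration.

```python
from fractions import Fraction


def cauchy_product(u, v):
    """Return the terms w_0, ..., w_{m-1} of the product series,
    where w_n = u_0*v_n + u_1*v_{n-1} + ... + u_n*v_0 and m = min(len(u), len(v))."""
    m = min(len(u), len(v))
    return [sum(u[k] * v[n - k] for k in range(n + 1)) for n in range(m)]


x = Fraction(1, 2)
u = [x ** n for n in range(60)]   # terms of the geometric series of ratio 1/2
w = cauchy_product(u, u)

print(w[:4])          # the exact values 1, 1, 3/4, 1/2
print(float(sum(w)))  # close to 4
```

Only the first `m` terms of the product are returned because each $w_n$ needs the terms of both series up to index $n$.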


Exercises 
1. (Telescoping series). Let $(a_n)_n$ be a sequence of complex numbers. Prove that the
series $\sum_{n \ge 0} (a_n - a_{n+1})$ is convergent (and has sum $S = a_0 - L$) if and only if
the sequence $(a_n)_n$ is convergent (to $L$).
2. Determine the nature of the series and compute the sum when it exists: 
1 , 
(@) Dnt cit 
1 ny 
(b) pee (1 7 4 ? 
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CDSE 
(dg) U4 eb Bx? a? hee fore = 1, 1): 
(e) $2 + (2 \cdot 3)x + (3 \cdot 4)x^2 + \cdots$ for $x \in (-1, 1)$.

3. Using Cauchy's Criterion 4.1.5, prove that the series $\sum_{n\ge 0} a_nz^n$ is absolutely convergent whenever $(a_n)_n$ is a bounded sequence of complex numbers, $z \in \mathbb{C}$ and $|z| < 1$.

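4. Use Theorem 4.1.7 to obtain the equality
\[ \Bigl(\sum_{n=0}^{\infty} \frac{1}{n!}\Bigr)\Bigl(\sum_{n=0}^{\infty} \frac{(-1)^n}{n!}\Bigr) = 1, \]
which gives us the formula $e^{-1} = \sum_{n=0}^{\infty} \frac{(-1)^n}{n!}$.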


5. One can prove that the series $\sum_{n\ge 1} \frac{(-1)^n}{\sqrt{n}}$ is convergent. See Theorem 4.3.2. Prove that the square of this series is a divergent series.

6. Prove the following companion of Theorem 4.1.7: If the series $\sum_{n\ge 0} u_n$ and $\sum_{n\ge 0} v_n$ are absolutely convergent, then the product series is also absolutely convergent.


4.2 Series of Nonnegative Numbers 


The partial sums of a series $\sum_{n\ge 0} a_n$ of nonnegative numbers are increasing. As a consequence, the convergence of such a series is equivalent to the existence of an upper bound for the partial sums.


4.2.1 Theorem (Comparison Test) Let $\sum_{n\ge 0} a_n$ and $\sum_{n\ge 0} b_n$ be two series of nonnegative numbers for which there exists $N \in \mathbb{N}$ such that
\[ a_n \le b_n \quad\text{for all } n \ge N. \]

(a) If the series $\sum_{n\ge 0} b_n$ is convergent, then the series $\sum_{n\ge 0} a_n$ is also convergent.
(b) If the series $\sum_{n\ge 0} a_n$ is divergent, then so is the series $\sum_{n\ge 0} b_n$.


Proof According to Lemma 4.1.4(b), we may assume that $N = 0$. Let $A_n = \sum_{k=0}^{n} a_k$ and $B_n = \sum_{k=0}^{n} b_k$. By the hypothesis, $0 \le A_n \le B_n$ for all $n$.

(a) If the series $\sum_{n\ge 0} b_n$ is convergent, then the sequence $(B_n)_n$ is bounded above, and the same is true for the sequence $(A_n)_n$. This implies the convergence of the series $\sum_{n\ge 0} a_n$.

(b) If the series $\sum_{n\ge 0} a_n$ is divergent, then $A_n \to \infty$ and thus $B_n \to \infty$. This implies that the series $\sum_{n\ge 0} b_n$ is divergent.

Since $\lfloor x\rfloor \le x < 2\lfloor x\rfloor$ for every $x \ge 1$, an immediate consequence of the Comparison Test is that the study of the nature of positive series is very close to that of the harmonic series $\sum_{n\ge 1} \frac{1}{n}$.


4.2.2 Corollary If $(a_n)_n$ is an unbounded sequence of real numbers belonging to $[1, \infty)$, then the series $\sum \frac{1}{a_n}$ and $\sum \frac{1}{\lfloor a_n\rfloor}$ have the same nature.


It is worth noticing the following multiplicative variant of the Comparison Test: If $\sum_{n\ge 0} a_n$ and $\sum_{n\ge 0} b_n$ are two series of positive numbers such that
\[ \lim_{n\to\infty} \frac{a_n}{b_n} \in (0, \infty), \]
then the two series have the same nature.


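By the Comparison Test, from the convergence of the series $\sum_{n=1}^{\infty} \frac{1}{n(n+1)}$ we obtain the convergence of the series $\sum_{n\ge 1} \frac{1}{n^2}$; indeed, $\frac{1}{(n+1)^2} < \frac{1}{n(n+1)}$ for all $n \ge 1$.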


The Comparison Test is the starting point of many other criteria of convergence. 


4.2.3 Theorem (Cauchy Condensation Test) Let $\sum_{n\ge 0} a_n$ be a series of nonnegative numbers such that $(a_n)_n$ is decreasing. Then $\sum_{n\ge 0} a_n$ is convergent if and only if the series $\sum_{n\ge 0} 2^n a_{2^n}$ is convergent.


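Proof Let
\[ S_n = \sum_{k=0}^{n} a_k \quad\text{and}\quad T_n = \sum_{k=0}^{n} 2^k a_{2^k}. \]

We have the following inequalities:
\begin{align*}
S_n \le S_{2^n-1} &= a_0 + a_1 + \underbrace{(a_2+a_3)}_{2\text{ terms}} + \underbrace{(a_4+\cdots+a_7)}_{2^2\text{ terms}} + \cdots + \underbrace{(a_{2^{n-1}}+\cdots+a_{2^n-1})}_{2^{n-1}\text{ terms}} \\
&\le a_0 + a_1 + 2a_2 + 2^2a_4 + \cdots + 2^{n-1}a_{2^{n-1}} = a_0 + T_{n-1}
\end{align*}
and
\begin{align*}
S_{2^n} &= a_0 + a_1 + a_2 + \underbrace{(a_3+a_4)}_{2\text{ terms}} + \underbrace{(a_5+\cdots+a_8)}_{2^2\text{ terms}} + \cdots + \underbrace{(a_{2^{n-1}+1}+\cdots+a_{2^n})}_{2^{n-1}\text{ terms}} \\
&\ge a_0 + a_1 + a_2 + 2a_{2^2} + 2^2a_{2^3} + \cdots + 2^{n-1}a_{2^n} \ge \frac{1}{2}T_n.
\end{align*}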


They show that the increasing sequences $(S_n)_n$ and $(T_n)_n$ are simultaneously bounded or unbounded, hence they have the same nature.

A consequence of the Condensation Test is the fact that the generalized harmonic series,
\[ \sum_{n\ge 1} \frac{1}{n^{\alpha}}, \]
is divergent if $\alpha \le 1$ and convergent if $\alpha > 1$. In fact, if $\alpha \le 0$, then $(1/n^{\alpha})_{n\ge 1}$ is not convergent to 0, and thus the series is divergent; see Lemma 4.1.2. If $\alpha > 0$, then the sequence $(1/n^{\alpha})_{n\ge 1}$ is decreasing to 0 and the Condensation Test implies that the series $\sum_{n\ge 1} 1/n^{\alpha}$ has the same nature as
\[ \sum_{n=1}^{\infty} \frac{2^n}{2^{n\alpha}} = \sum_{n=1}^{\infty} \frac{1}{2^{(\alpha-1)n}} = \sum_{n=1}^{\infty} \Bigl(\frac{1}{2^{\alpha-1}}\Bigr)^{n}, \]
that is, with the geometric series of ratio $1/2^{\alpha-1}$.
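The two inequalities used in the proof of the Condensation Test can be checked numerically. The sketch below is only an illustration under our own assumptions: the helper names are hypothetical, and the test sequences are shifted by one index, $a_k = 1/(k+1)^{\alpha}$, so that $a_0$ is defined.

```python
def partial_sum(a, m):
    """S_m = a_0 + a_1 + ... + a_m."""
    return sum(a(k) for k in range(m + 1))

def condensed_sum(a, m):
    """T_m = sum over k = 0..m of 2^k * a_{2^k}."""
    return sum(2 ** k * a(2 ** k) for k in range(m + 1))

def check_condensation(a, n):
    """Truth values of S_{2^n - 1} <= a_0 + T_{n-1} and of S_{2^n} >= T_n / 2."""
    upper = partial_sum(a, 2 ** n - 1) <= a(0) + condensed_sum(a, n - 1)
    lower = partial_sum(a, 2 ** n) >= condensed_sum(a, n) / 2
    return upper, lower

# Hypothetical test sequences, shifted by one so that a_0 is defined.
harmonic = lambda k: 1 / (k + 1)        # divergent case (alpha = 1)
squares = lambda k: 1 / (k + 1) ** 2    # convergent case (alpha = 2)

for name, a in (("1/(k+1)", harmonic), ("1/(k+1)^2", squares)):
    print(name, check_condensation(a, 10), condensed_sum(a, 10), condensed_sum(a, 20))
```

For the harmonic case the condensed sums keep growing with the number of terms, while for the squares they stay bounded, which mirrors the dichotomy $\alpha \le 1$ versus $\alpha > 1$.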


4.2.4 Theorem (The Root Test) Let $\sum_{n\ge 0} a_n$ be a series of nonnegative numbers.

(a) The series is convergent if $\limsup_{n\to\infty} \sqrt[n]{a_n} < 1$.

(b) The series is divergent if $\limsup_{n\to\infty} \sqrt[n]{a_n} > 1$.

(c) The case $\limsup_{n\to\infty} \sqrt[n]{a_n} = 1$ is inconclusive. There are both convergent and divergent series of positive numbers verifying it.


Proof (a) If $L = \limsup_{n\to\infty} \sqrt[n]{a_n} < 1$, we may choose $\varepsilon > 0$ such that $L + \varepsilon < 1$. By Exercise 2 at the end of Sect. 2.6, we infer that $a_n \le (L + \varepsilon)^n$, except for finitely many terms. Taking into account that the series $\sum_{n\ge 0}(L + \varepsilon)^n$ is convergent, we infer from the Comparison Test that the series $\sum_{n\ge 0} a_n$ is convergent too.

(b) When $\limsup_{n\to\infty} \sqrt[n]{a_n} > 1$, by the same Exercise 2, we deduce that $a_n > 1$ for infinitely many indices $n$, so the sequence $(a_n)_n$ cannot be convergent to 0. This yields the divergence of the series $\sum_{n\ge 0} a_n$.

(c) See the case of the generalized harmonic series $\sum_{n\ge 1} \frac{1}{n^p}$, where $\lim_{n\to\infty} \sqrt[n]{a_n} = 1$ for all $p \in \mathbb{R}$, despite the fact that convergence occurs only when $p > 1$.


4.2.5 Corollary Let $\sum_{n\ge 0} a_n$ be a series of nonnegative numbers such that the limit
\[ L = \lim_{n\to\infty} \sqrt[n]{a_n} \]
exists.

(a) If $L < 1$, then the series is convergent.
(b) If $L > 1$, then the series is divergent.
(c) The case $L = 1$ is inconclusive.


4.2.6 Theorem (The Ratio Test of J. D'Alembert) Let $\sum_{n\ge 0} a_n$ be a series of positive numbers.

(a) If $\limsup_{n\to\infty} \frac{a_{n+1}}{a_n} < 1$, then the series $\sum_{n\ge 0} a_n$ is convergent.

(b) If $\liminf_{n\to\infty} \frac{a_{n+1}}{a_n} > 1$, then the series $\sum_{n\ge 0} a_n$ is divergent.

(c) The test is inconclusive when $\liminf_{n\to\infty} \frac{a_{n+1}}{a_n} \le 1 \le \limsup_{n\to\infty} \frac{a_{n+1}}{a_n}$.


Proof (a) If $L = \limsup_{n\to\infty} \frac{a_{n+1}}{a_n} < 1$, we may choose $\varepsilon > 0$ such that $L + \varepsilon < 1$. According to Exercise 2 in Sect. 2.6, there must exist a natural number $N$ such that $\frac{a_{n+1}}{a_n} \le L + \varepsilon$ for all $n \ge N$. Then
\[ a_{N+1} \le (L+\varepsilon)a_N, \quad a_{N+2} \le (L+\varepsilon)a_{N+1} \le (L+\varepsilon)^2a_N, \quad a_{N+3} \le (L+\varepsilon)^3a_N, \]
and so on. Since the series $\sum_{n\ge 0}(L + \varepsilon)^n a_N$ is convergent, the Comparison Test implies that the series $\sum_{n\ge 0} a_n$ is convergent too.

(b) If $\liminf_{n\to\infty} \frac{a_{n+1}}{a_n} > 1$, then the same argument shows that $\frac{a_{n+1}}{a_n} > 1$ for all $n \ge N$. The sequence $(a_n)_n$ cannot converge to 0 and this implies the divergence of the series $\sum_{n\ge 0} a_n$.

(c) See the case of the generalized harmonic series $\sum_{n\ge 1} \frac{1}{n^p}$, where $\lim_{n\to\infty} \frac{a_{n+1}}{a_n} = 1$ for all $p \in \mathbb{R}$, despite the fact that convergence occurs only when $p > 1$.


4.2.7 Corollary Let $\sum_{n\ge 0} a_n$ be a series of positive numbers such that the limit
\[ L = \lim_{n\to\infty} \frac{a_{n+1}}{a_n} \]
exists.

(a) If $L < 1$, then the series $\sum_{n\ge 0} a_n$ is convergent.
(b) If $L > 1$, then the series $\sum_{n\ge 0} a_n$ is divergent.
(c) The case $L = 1$ is inconclusive.


If $\lim_{n\to\infty} \frac{a_{n+1}}{a_n} = 1$, then $\lim_{n\to\infty} \sqrt[n]{a_n} = 1$. Using Exercise 5 at the end of Sect. 2.6, one can prove easily that whenever the Root Test is inconclusive, so is the Ratio Test. However, there are examples where the Ratio Test fails and the Root Test is conclusive. See the case of the series $\sum_{n\ge 0} a_n$, where
\[ a_{2n} = \frac{2^{n+1}}{3^n} \quad\text{and}\quad a_{2n+1} = \frac{2^n}{3^n} \quad\text{for all } n \in \mathbb{N}. \]
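The behavior of this example can be observed numerically. The following sketch (an illustration only; the ranges 60 and 400 are arbitrary choices of ours) lists the consecutive ratios and the $n$-th roots of the terms.

```python
def a(n):
    """a_{2m} = 2^(m+1) / 3^m and a_{2m+1} = 2^m / 3^m."""
    m = n // 2
    return 2 ** (m + 1) / 3 ** m if n % 2 == 0 else 2 ** m / 3 ** m

ratios = [a(n + 1) / a(n) for n in range(60)]
roots = [a(n) ** (1 / n) for n in range(1, 400)]

# The ratios alternate between 1/2 and 4/3, so liminf < 1 < limsup
# and the Ratio Test says nothing.
print(min(ratios), max(ratios))
# The n-th roots approach sqrt(2/3) < 1, so the Root Test gives convergence.
print(roots[-2], roots[-1], (2 / 3) ** 0.5)
```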


When the Root Test and the Ratio Test both fail, one can try the following con- 
vergence test. An example is provided by Exercise 2(b). 


4.2.8 Theorem (The Raabe-Duhamel Test) Let $\sum_{n\ge 0} a_n$ be a series of positive numbers for which the limit
\[ \lim_{n\to\infty} n\Bigl(\frac{a_n}{a_{n+1}} - 1\Bigr) \]
exists and equals $L$. If $L > 1$, then the series is convergent, while if $L < 1$, then the series is divergent.


Proof Assume that $L > 1$ and let $L > \alpha > 1$. Since
\[ \lim_{n\to\infty} n\Bigl(\frac{a_n}{a_{n+1}} - 1\Bigr) = L > \alpha, \]
there is an index $N$ such that
\[ n\Bigl(\frac{a_n}{a_{n+1}} - 1\Bigr) > \alpha \]
for all $n \ge N$, that is,
\[ na_n - (n+1)a_{n+1} > (\alpha - 1)a_{n+1} \]
for all $n \ge N$. As an immediate consequence, we infer that the sequence $(na_n)_n$ is decreasing, starting with the rank $N$. By the Monotone Convergence Theorem it follows that this sequence is convergent. Then, the series
\[ \sum_{n\ge N} \frac{na_n - (n+1)a_{n+1}}{\alpha - 1} \]
converges and the Comparison Test implies that the series $\sum_{n\ge N} a_{n+1}$ converges too. The convergence of the series $\sum_{n\ge 0} a_n$ follows now from Lemma 4.1.4.

Assume now that $L < 1$. A similar argument shows that the sequence $(na_n)_n$ is increasing starting with a suitable rank $N$. Then, $a_n \ge (Na_N)\cdot\frac{1}{n}$ for all $n \ge N$, and the Comparison Test implies that the given series is divergent.
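For the generalized harmonic terms $a_n = 1/n^p$ the ratio $a_{n+1}/a_n$ tends to 1, while the Raabe-Duhamel expression $n(a_n/a_{n+1} - 1) = n((1 + 1/n)^p - 1)$ tends to $p$. The sketch below evaluates both at one large index; the function name `raabe_quantity` and the index $10^6$ are our own choices.

```python
def raabe_quantity(a, n):
    """n * (a_n / a_{n+1} - 1), the expression whose limit L appears in Theorem 4.2.8."""
    return n * (a(n) / a(n + 1) - 1)

for p in (0.5, 1.0, 2.0):
    a = lambda n, p=p: 1 / n ** p
    # The ratio is close to 1 in every case; the Raabe quantity is close to p.
    print(p, a(10**6 + 1) / a(10**6), raabe_quantity(a, 10**6))
```

The value $p = 1$ gives $L = 1$, the case in which the theorem makes no claim.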


There are many other convergence tests for series. One of the most powerful is 
the integral test, which as the name says, requires integral calculus. See Exercise 7 
at the end of Sect. 10.1. 


4.2.9 Remark Using the theory developed in the last two sections, one can introduce 
easily some of the most basic functions of complex variable as sums of convenient 
absolutely convergent series: 


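the exponential function of base $e$,
\[ e^z = \sum_{n=0}^{\infty} \frac{z^n}{n!} = 1 + \frac{z}{1!} + \frac{z^2}{2!} + \cdots, \]
the sine function,
\[ \sin z = \sum_{n=0}^{\infty} (-1)^n \frac{z^{2n+1}}{(2n+1)!} = z - \frac{z^3}{3!} + \frac{z^5}{5!} - \cdots, \]
the cosine function,
\[ \cos z = \sum_{n=0}^{\infty} (-1)^n \frac{z^{2n}}{(2n)!} = 1 - \frac{z^2}{2!} + \frac{z^4}{4!} - \cdots. \]

In all the three cases, the Ratio Test assures the absolute convergence for all $z \in \mathbb{C}$.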

The sine function is odd, while the cosine function is even. 

Clearly, $e^0 = 1$ and $e^1 = e$, while the Theorem of Mertens easily yields the fundamental property of the exponential,
\[ e^{z_1+z_2} = e^{z_1}e^{z_2} \quad\text{for all } z_1, z_2 \in \mathbb{C}. \]


The exponential function establishes a bijection between $\mathbb{R}$ and $(0, \infty)$, whose inverse is the logarithm function $\log$. The extension of $\log$ to complex values of the argument is a delicate problem, presented in the books on complex analysis.
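The series definitions above are easy to experiment with. The following sketch sums finitely many terms of each series and compares the results with Python's `cmath` functions; the function names, the numbers of terms and the sample points are our own choices, and the agreement is only up to rounding and truncation errors.

```python
import cmath
from math import factorial

def exp_series(z, terms=40):
    """Partial sum of sum_{n>=0} z^n / n!."""
    return sum(z ** n / factorial(n) for n in range(terms))

def sin_series(z, terms=20):
    """Partial sum of sum_{n>=0} (-1)^n z^(2n+1) / (2n+1)!."""
    return sum((-1) ** n * z ** (2 * n + 1) / factorial(2 * n + 1) for n in range(terms))

def cos_series(z, terms=20):
    """Partial sum of sum_{n>=0} (-1)^n z^(2n) / (2n)!."""
    return sum((-1) ** n * z ** (2 * n) / factorial(2 * n) for n in range(terms))

z1, z2 = 1 + 2j, -0.5 + 0.3j
print(abs(exp_series(z1) - cmath.exp(z1)))
print(abs(sin_series(z1) - cmath.sin(z1)), abs(cos_series(z1) - cmath.cos(z1)))
# The fundamental property e^(z1 + z2) = e^(z1) e^(z2), checked on partial sums.
print(abs(exp_series(z1 + z2) - exp_series(z1) * exp_series(z2)))
```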

The power function $x^{\alpha}$ (of real exponent $\alpha$) is defined via the formula
\[ x^{\alpha} = e^{\alpha\log x} \quad\text{for all } x > 0. \]

Euler, who actually introduced the trigonometric functions for complex values, had noticed the formula
\[ e^{ix} = \cos x + i\sin x \quad\text{for all } x \in \mathbb{R}. \]


Therefore
\[ \cos^2 x + \sin^2 x = (\cos x - i\sin x)(\cos x + i\sin x) = e^{-ix}e^{ix} = e^0 = 1 \]
for all $x \in \mathbb{R}$. It is worth noticing that all properties and formulas known from Calculus concerning the above functions can be retrieved easily from their definitions as sums of suitable absolutely convergent series, in particular, the periodicity of trigonometric functions and all trigonometric formulas. Full details are to be found in Chap. 7.

Exercises


1. Determine which of the following series are convergent:


(2n)! 
ODE OD TT ran rod( se 2). @ > me 


a0 nao | 


2. Leta > 0. Prove a, 
(a) the series )° 
(b) the series )° 


Hs 12 aE! ig convergent if a < e and divergent otherwise; 


n>0 ole baat is convergent ifa@ > | and divergentifa < 1. 


3. (Olivier's Test). Let $\sum_{n\ge 0} a_n$ be a convergent series of nonnegative numbers. If $(a_n)_n$ is decreasing, prove that
\[ \lim_{n\to\infty} na_n = 0. \]
See Exercise 4, Sect. 4.3, for an application.
[Hint: Notice that $2na_{2n} \le 2(a_{2n} + a_{2n-1} + \cdots + a_{n+1})$.]

4. If we drop the hypothesis on monotonicity, the result of Exercise 3 above may fail. Give an example and infer from Theorem 2.8.1 that any convergent series $\sum_{n\ge 0} a_n$ of nonnegative numbers verifies the weaker condition $na_n \to 0$ in density. Show that this latter result actually implies Olivier's Test.

Remark The behavior of convergent series of nonnegative numbers from the 
perspective of higher order densities is presented in [1]. 

5. Let $\sum_{n\ge 0} a_n$ be a divergent series of nonnegative numbers. Prove that there exists a sequence $(\varepsilon_n)_n$ of positive numbers that converges to zero but for which $\sum_{n\ge 0} \varepsilon_na_n$ still diverges.


+ ae : Snt1—5, 
[Hint: Look at the series ban Tae “where S, =ag+- - ay.| 
6. Infer from the Cauchy Condensation Test that the series $\sum_{n\ge 2} \frac{1}{n\log n}$ is divergent, while the series $\sum_{n\ge 2} \frac{1}{n\log^2 n}$ is convergent.
7. (Nicole Oresme). If (S,,), denotes the sequence of partial sums of the harmonic 

series, show that n — 1 + wn > Son > 1+ 5 

[Hint: Adapt the proof of Cauchy Condensation Test. ] 


8. (Kempner's "no 9" series). Prove that the series
\[ \frac{1}{1} + \frac{1}{2} + \cdots + \frac{1}{8} + \frac{1}{10} + \cdots + \frac{1}{18} + \frac{1}{20} + \cdots, \]
where the denominators are the positive integers that do not contain the digit 9, is convergent.


The sum of Kempner’s “no 9” series is about 22.92067... . However, the conver- 
gence is so slow that the sum of all terms with denominators <10°° is still less 


than 22, so the accurate summation of this series needs special algorithms. See [2] for details.

9. (A generalization of the Comparison Test). Let $\sum_{n\ge 0} a_n$, $\sum_{n\ge 0} b_n$ and $\sum_{n\ge 0} c_n$ be three series of real numbers for which there exists $N \in \mathbb{N}$ such that
\[ a_n \le b_n \le c_n \quad\text{for all } n \ge N. \]

(a) If the series $\sum_{n\ge 0} a_n$ and $\sum_{n\ge 0} c_n$ are convergent, then the series $\sum_{n\ge 0} b_n$ is convergent too.

(b) If the series $\sum_{n\ge 0} a_n$ is divergent to $\infty$, then so is the series $\sum_{n\ge 0} b_n$.

(c) If the series $\sum_{n\ge 0} c_n$ is divergent to $-\infty$, then so is the series $\sum_{n\ge 0} b_n$.

10. Let $(a_n)_n$ be any sequence of real numbers such that $\sup_n \bigl|\sum_{k=0}^{n} a_k\bigr| < \infty$. Use the inequality
: 3 
a ee ara for x € [—1, 1], 


to infer from the preceding exercise that the series $\sum_{n\ge 1} \bigl[\exp\bigl(\frac{a_n}{n^{\alpha}}\bigr) - 1\bigr]$ converges for $\alpha > 1/2$ and diverges to $\infty$ if $0 < \alpha \le 1/2$. The convergence is absolute if and only if $\alpha > 1$.


4.3 Abel-Dirichlet Test and Its Consequences 


The following convergence test, based on Cauchy Criterion, applies to series that are 
not necessarily absolutely convergent: 


4.3.1 Theorem (Abel-Dirichlet Test) Let $(a_n)_n$ be a sequence of real numbers decreasing to 0, and $(z_n)_n$ a sequence of complex numbers whose partial sums $\sum_{k=0}^{n} z_k$ form a bounded sequence. Then the series $\sum_{n\ge 0} a_nz_n$ is convergent.


Proof Let $S_n = z_0 + \cdots + z_n$. By the hypothesis, there is $M > 0$ such that $|S_n| \le M$ for all $n$. Then
\begin{align*}
|a_nz_n + a_{n+1}z_{n+1} + \cdots + a_{n+p}z_{n+p}|
&= |a_n(S_n - S_{n-1}) + a_{n+1}(S_{n+1} - S_n) + \cdots + a_{n+p}(S_{n+p} - S_{n+p-1})| \\
&= |-a_nS_{n-1} + S_n(a_n - a_{n+1}) + S_{n+1}(a_{n+1} - a_{n+2}) + \cdots \\
&\qquad + S_{n+p-1}(a_{n+p-1} - a_{n+p}) + a_{n+p}S_{n+p}| \\
&\le Ma_n + M(a_n - a_{n+1}) + M(a_{n+1} - a_{n+2}) + \cdots \\
&\qquad + M(a_{n+p-1} - a_{n+p}) + Ma_{n+p} \\
&= 2Ma_n,
\end{align*}
for all $n, p \in \mathbb{N}$. Since $a_n \to 0$, for each $\varepsilon > 0$ we can find an index $N$ such that
\[ a_n < \frac{\varepsilon}{2M} \quad\text{for all } n \ge N. \]
Therefore $|a_nz_n + a_{n+1}z_{n+1} + \cdots + a_{n+p}z_{n+p}| < \varepsilon$ for all $n \ge N$ and $p \ge 0$, so that, by Cauchy's Criterion 4.1.5, we conclude that the series $\sum_{n\ge 0} a_nz_n$ is convergent.
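The estimate $|a_nz_n + \cdots + a_{n+p}z_{n+p}| \le 2Ma_n$ obtained in the proof can be tested on a concrete example. The sketch below uses assumptions of our own: $a_k = 1/(k+1)$, $z_k = e^{ik}$ (whose partial sums are bounded geometric sums), and $M$ computed as the largest modulus of the partial sums actually involved.

```python
import cmath

def block_and_bound(a, z, n, p):
    """Return |a_n z_n + ... + a_{n+p} z_{n+p}| and the bound 2 * M * a_n.

    M is the largest modulus among the partial sums S_0, ..., S_{n+p} of (z_k).
    """
    partial, M = 0, 0.0
    for k in range(n + p + 1):
        partial += z(k)
        M = max(M, abs(partial))
    block = sum(a(k) * z(k) for k in range(n, n + p + 1))
    return abs(block), 2 * M * a(n)

a = lambda k: 1 / (k + 1)          # decreases to 0
z = lambda k: cmath.exp(1j * k)    # partial sums stay bounded

for n, p in ((1, 0), (5, 3), (50, 100), (500, 1000)):
    print(n, p, block_and_bound(a, z, n, p))
```

In each printed pair the first number does not exceed the second, and both become small as $n$ grows, which is exactly what Cauchy's Criterion requires.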


A series of real numbers $\sum_{n\ge 0} u_n$ is alternating if $\operatorname{sgn} u_n \cdot \operatorname{sgn} u_{n+1} < 0$ for all indices $n$. A typical example is the alternating harmonic series
\[ \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots, \]
whose convergence (and computation of its sum $S = \ln 2$) was the object of Exercise 2, at the end of Sect. 2.3. The alternating harmonic series is not absolutely convergent since the harmonic series is not convergent.

A useful convergence test for alternating series is provided by the following con- 
sequence of the Abel—Dirichlet Test: 


4.3.2 Theorem (Leibniz Alternating Series Test) If the sequence $(a_n)_n$ decreases to 0, then the series $\sum_{n\ge 0} (-1)^n a_n$ is convergent.


We can provide a direct argument for the Leibniz Alternating Series Test that also shows how well the partial sums approximate the sum of the series.


4.3.3 Proposition Let $\sum_{n\ge 0} (-1)^n a_n$ be a series as in Theorem 4.3.2, having the sum $S$. Then, the error made by replacing the sum $S$ of the series by one of its partial sums does not exceed the absolute value of the first term omitted, that is,
\[ \Bigl| S - \sum_{k=0}^{n} (-1)^k a_k \Bigr| \le a_{n+1}. \]
Proof First, notice that the subsequence $(S_{2n})_n$, of even ranked partial sums, is decreasing and the subsequence $(S_{2n+1})_n$, of odd ranked partial sums, is increasing. In fact,
\[ S_{2n+2} = S_{2n} - a_{2n+1} + a_{2n+2} = S_{2n} - (a_{2n+1} - a_{2n+2}) \le S_{2n} \]
as $(a_n)_n$ is decreasing. In the same way,
\[ S_{2n+1} = S_{2n-1} + a_{2n} - a_{2n+1} \ge S_{2n-1}. \]
Moreover, $S_{2n+1} = S_{2n} - a_{2n+1} \le S_{2n}$ (since the terms of $(a_n)_n$ are positive). Then,
\[ S_1 \le S_3 \le \cdots \le S_{2n+1} \le S_{2n} \le \cdots \le S_4 \le S_2 \le S_0 \]
and the Monotone Convergence Theorem implies that both $(S_{2n})_n$ and $(S_{2n+1})_n$ are convergent. Since $S_{2n+1} = S_{2n} - a_{2n+1}$ and $\lim_{n\to\infty} a_n = 0$, the two subsequences have the same limit, from which we infer the convergence of the series. Let $S$ be its sum. Since $(S_{2n})_n$ is decreasing and $(S_{2n+1})_n$ is increasing, we get
\[ S_{2n+1} \le S \le S_{2n} \quad\text{for all } n. \]
Then
\[ 0 \le S_{2n} - S \le S_{2n} - S_{2n+1} = a_{2n+1}, \]
and
\[ 0 \le S - S_{2n+1} \le S_{2n+2} - S_{2n+1} = a_{2n+2}. \]
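The error estimate can be observed on the alternating harmonic series written in the form $\sum_{n\ge 0} (-1)^n a_n$ with $a_n = 1/(n+1)$, whose sum is $\ln 2$. The sketch below is an illustration only; the function name and the sample indices are our own choices.

```python
from math import log

def leibniz_partial_sum(n):
    """S_n = sum_{k=0}^{n} (-1)^k a_k with a_k = 1 / (k + 1)."""
    return sum((-1) ** k / (k + 1) for k in range(n + 1))

S = log(2)
for n in (0, 1, 2, 9, 10, 999):
    error, bound = abs(S - leibniz_partial_sum(n)), 1 / (n + 2)
    # bound is a_{n+1}, the first omitted term
    print(n, error, bound, error <= bound)
```

The even ranked partial sums lie above $\ln 2$ and the odd ranked ones below it, as in the proof.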


4.3.4 Corollary If $0 < x < \sqrt{6}$, then $x - \frac{x^3}{6} < \sin x < x$.

Therefore, $\sin x < x$ if $x > 0$ and $\sin x > x$ if $x < 0$. As a consequence, the equation
\[ \sin x = x \]
has a unique solution, $x = 0$.

Proof In fact, if $0 < x < \sqrt{6}$, the series $\sum_{n\ge 0} (-1)^n \frac{x^{2n+1}}{(2n+1)!}$ defining $\sin x$ verifies the hypotheses of Proposition 4.3.3 and thus its partial sums verify $S_1 < \sin x < S_0$. Therefore
\[ x - \frac{x^3}{3!} < \sin x < x \]
for all $x \in (0, \sqrt{6})$.

We will prove later that $x - \frac{x^3}{3!} < \sin x < x$ for all $x > 0$.


The order of summation of a series could be important. We shall illustrate this idea in the case of the alternating harmonic series. Denote by $S$ its sum ($S \ge 1/2$ by Proposition 4.3.3), and suppose that the partial sums can be computed by taking the terms in an arbitrary order. Then,
\[ S = \sum_{n=1}^{\infty} \Bigl(\frac{1}{4n-3} - \frac{1}{4n-2} + \frac{1}{4n-1} - \frac{1}{4n}\Bigr) \]
and also
\[ \frac{1}{2}S = \sum_{n=1}^{\infty} \Bigl(\frac{1}{4n-2} - \frac{1}{4n}\Bigr). \]
Therefore,
\[ \frac{3}{2}S = 1 + \frac{1}{3} - \frac{1}{2} + \frac{1}{5} + \frac{1}{7} - \frac{1}{4} + \cdots = S, \]
which is a contradiction.

As we will show in the next section, the order of summation affects all conditionally convergent series.
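The effect of reordering can be seen numerically. The sketch below sums the alternating harmonic series taking two positive terms followed by one negative term, that is, in blocks $\frac{1}{4n-3} + \frac{1}{4n-1} - \frac{1}{2n}$; this particular block form and the function name are our own way of organizing the computation.

```python
from math import log

def rearranged_sum(blocks):
    """Sum of the first `blocks` groups 1/(4n-3) + 1/(4n-1) - 1/(2n), n = 1, 2, ..."""
    total = 0.0
    for n in range(1, blocks + 1):
        total += 1 / (4 * n - 3) + 1 / (4 * n - 1) - 1 / (2 * n)
    return total

print(rearranged_sum(100_000))   # rearranged series
print(1.5 * log(2))              # (3/2) ln 2
print(log(2))                    # sum in the original order
```

The rearranged sums approach $\frac{3}{2}\ln 2$ rather than $\ln 2$, so the same terms taken in a different order give a different sum.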


Exercises 


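1. Suppose that $z \in \mathbb{C}\setminus\mathbb{R}$. Prove that the sequence
\[ z_1 = 1 \quad\text{and}\quad z_n = 1 + \operatorname{sgn}(z) + \cdots + \operatorname{sgn}(z^{n-1}) \quad\text{for } n \ge 2, \]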


is bounded and infer from the Abel—Dirichlet Criterion the convergence of the 


series 57,50 seule y 
2. (The importance of the order of summation). Consider the alternating harmonic series
\[ 1 - \frac{1}{2} + \frac{1}{3} - \cdots \]
and a new rule of computing the partial sums: $S_1$ is the sum of the first $p$ positive terms, $S_2$ adds to this the sum of the first $q$ negative terms, then $S_3$ adds the next $p$ positive terms and so on. Prove that $(S_n)_n$ is convergent to $\ln 2 + \frac{1}{2}\ln\frac{p}{q}$.

3. Let $(a_n)_{n\ge 1}$ be a decreasing sequence of positive numbers, that converges to 0. Prove that the series $\sum_{n\ge 1} a_n$ and $\sum_{n\ge 1} n(a_n - a_{n+1})$ have the same nature (and the same sum, if they are convergent). In particular,


4. (An extension of Olivier's Test for series of complex numbers [1]). Suppose that $(a_n)_n$ is a decreasing sequence converging to 0 and $(z_n)_n$ is a sequence of complex numbers such that the series $\sum a_nz_n$ is convergent. Prove that
\[ \lim_{n\to\infty} \Bigl(\sum_{k=1}^{n} z_k\Bigr)a_n = 0. \]

The trick of changing the order of summation in a finite sum (that was used in the proof of the Abel-Dirichlet Test) can be reformulated in a more convenient way by using the forward differences $\Delta x_n = x_{n+1} - x_n$ associated to the numbers $x_0, x_1, x_2, \ldots$. Precisely,
\[ \sum_{k=m}^{n-1} (\Delta u_k)v_k = u_nv_n - u_mv_m - \sum_{k=m}^{n-1} u_{k+1}\Delta v_k, \]
for any two numerical families $u_m, u_{m+1}, \ldots, u_n$ and $v_m, v_{m+1}, \ldots, v_n$. This formula (known as Abel's Partial Summation Formula) is the analogue, for series, of the integration by parts.
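Abel's Partial Summation Formula is a finite identity, so it can be verified exactly on any data. The sketch below does this with rational numbers; the function name `abel_sides` and the two sample families are hypothetical choices of ours.

```python
from fractions import Fraction

def abel_sides(u, v, m, n):
    """Both sides of sum_{k=m}^{n-1} (Δu_k) v_k = u_n v_n - u_m v_m - sum_{k=m}^{n-1} u_{k+1} Δv_k."""
    lhs = sum((u[k + 1] - u[k]) * v[k] for k in range(m, n))
    rhs = (u[n] * v[n] - u[m] * v[m]
           - sum(u[k + 1] * (v[k + 1] - v[k]) for k in range(m, n)))
    return lhs, rhs

# Arbitrary sample families, indexed from 0 to 11.
u = [Fraction(1, k + 1) for k in range(12)]
v = [Fraction(k * k - 3, 2) for k in range(12)]

print(abel_sides(u, v, 2, 11))
print(abel_sides(u, v, 0, 5))
```

Each printed pair consists of two equal fractions, as the formula predicts.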


5. Assume the series $\sum_{n\ge 0} u_n$ is convergent and the sequence $(v_n)_n$ is monotonic and bounded. Use Abel's partial summation formula to prove that the series $\sum_{n\ge 0} u_nv_n$ is convergent too.


6. (A Raabe-Duhamel Test). Consider the alternating series $\sum_{n\ge 0} (-1)^n a_n$, where $a_n > 0$ for all $n$. Prove that this series is convergent provided that
\[ \lim_{n\to\infty} n\Bigl(\frac{a_n}{a_{n+1}} - 1\Bigr) > 0. \]


7. For which values of the real parameters $p$ and $q$ is the alternating series


n'in@ 
n>1 


convergent? 


8. (Euler's transformation of series). Prove that


(ee) 


lo) n 
= 1 n 
Soya Sg Sent ou 
k=0 


n=1 n=0 


provided the left-hand series is convergent. 

9. (An identity due to Euler, that relates series expansions with continued fractions). Prove, by mathematical induction, that for every nonzero real numbers $a_1, a_2, \ldots, a_n$, we have


2 
| 


* + an -4n_-2+ 
n—1~9n-2° ana, 7 


Then, apply this formula to compute $\ln 2$ (the sum of the alternating harmonic series) with three exact decimals.

10. (Hardy's Tauberian Theorem). Let $(a_n)_n$ be a complex sequence. Put
\[ S_n = \sum_{k=0}^{n} a_k \quad\text{and}\quad \sigma_n = \frac{1}{n+1}\sum_{k=0}^{n} S_k. \]
Prove that $(S_n)_n$ is convergent if $(\sigma_n)_n$ is convergent and $\sup_n n|a_n| < \infty$.


Remark According to Corollary 2.7.3(a), if $(a_n)_n$ is a sequence of positive numbers, then the convergence of $(S_n)_n$ implies the convergence of $(\sigma_n)_n$.


4.4 Unconditionally Convergent Series 


The order of summation proves to be unimportant for the absolutely convergent series. More precisely, any absolutely convergent series $\sum_{n\ge 0} a_n$ is unconditionally convergent in the sense that whenever $\pi : \mathbb{N} \to \mathbb{N}$ is a permutation (that is, a bijective function), the series $\sum_{n\ge 0} a_{\pi(n)}$ is also absolutely convergent and
\[ \sum_{n=0}^{\infty} a_n = \sum_{n=0}^{\infty} a_{\pi(n)}. \]


This fact is immediate, via the concept of summable series. 


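4.4.1 Definition A series $\sum_{n\ge 0} a_n$ is summable (with sum $S$) if for each $\varepsilon > 0$, there is a finite subset $F_{\varepsilon} \subset \mathbb{N}$ such that
\[ \Bigl| S - \sum_{n\in F} a_n \Bigr| < \varepsilon \]
for all finite subsets $F \subset \mathbb{N}$ with $F \supset F_{\varepsilon}$.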


4.4.2 Lemma An absolutely convergent series is summable. 


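Proof Suppose that $\sum_{n\ge 0} a_n$ is an absolutely convergent series and $S$ is its sum. Let $\varepsilon > 0$. Since the series is absolutely convergent, there is an index $N$ such that $\sum_{n>N} |a_n| < \varepsilon$. Put $F_{\varepsilon} = \{0, 1, \ldots, N\}$. Then, for every finite subset $F \subset \mathbb{N}$ with $F \supset F_{\varepsilon}$, we have
\[ \Bigl| S - \sum_{n\in F} a_n \Bigr| = \Bigl| \sum_{n\notin F} a_n \Bigr| \le \sum_{n\notin F} |a_n| \le \sum_{n\notin F_{\varepsilon}} |a_n| < \varepsilon. \]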


The converse of Lemma 4.4.2 is also true. We will need the following fact:

4.4.3 Lemma Let $\sum_{n\ge 0} a_n$ be a series for which there exists a positive constant $C$ such that $\bigl|\sum_{n\in F} a_n\bigr| \le C$ for all finite subsets $F \subset \mathbb{N}$. Then,
\[ \sum_{n\in F} |a_n| \le 4C \]
for all finite subsets $F \subset \mathbb{N}$.


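Proof First, consider the case where all numbers $a_n$ are real. In this case, each finite subset $F \subset \mathbb{N}$ decomposes into $F^{+} = \{n \in F : a_n \ge 0\}$ and $F^{-} = \{n \in F : a_n < 0\}$. Then,
\[ \sum_{n\in F} |a_n| = \sum_{n\in F^{+}} a_n - \sum_{n\in F^{-}} a_n = \Bigl|\sum_{n\in F^{+}} a_n\Bigr| + \Bigl|\sum_{n\in F^{-}} a_n\Bigr| \le 2\max_{G\subset F} \Bigl|\sum_{n\in G} a_n\Bigr| \le 2C. \]

In the complex case, use the fact that $|\operatorname{Im} z| \le |z|$ and $|\operatorname{Re} z| \le |z|$, and apply the preceding inequality for $\operatorname{Re} a_n$ and $\operatorname{Im} a_n$.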
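For a short list of complex numbers the smallest admissible constant $C$ in Lemma 4.4.3 can be found by brute force over all nonempty subsets, and the inequality $\sum |a_n| \le 4C$ can then be checked directly. The sketch below does this for one arbitrary list of our own choosing; it is an illustration, not a proof.

```python
from itertools import combinations

def max_subset_sum_modulus(a):
    """Largest |sum_{n in F} a_n| over all nonempty index sets F (brute force, short lists only)."""
    indices = range(len(a))
    return max(abs(sum(a[i] for i in F))
               for size in range(1, len(a) + 1)
               for F in combinations(indices, size))

# An arbitrary sample of complex numbers.
a = [1 + 2j, -3 + 0.5j, 0.25 - 1j, -1 - 1j, 2j, 1.5, -0.75 + 0.75j, -2 - 2j]
C = max_subset_sum_modulus(a)
total = sum(abs(t) for t in a)

print(total, C, total <= 4 * C)
```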


4.4.4 Theorem (Dirichlet's Summability Theorem) Let $\sum_{n\ge 0} a_n$ be a series. Then, the following assertions are equivalent:


(a) The series is absolutely convergent; 
(b) The series is summable; 
(c) The series is unconditionally convergent. 


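Proof The fact that (a) implies (b) was proved in Lemma 4.4.2.

(b) $\Rightarrow$ (c) Let $\pi$ be a permutation of $\mathbb{N}$ and let $\varepsilon > 0$. By the hypothesis, there is a finite subset $F_{\varepsilon} \subset \mathbb{N}$ such that $\bigl|S - \sum_{n\in F} a_n\bigr| < \varepsilon$ for all finite subsets $F \subset \mathbb{N}$ with $F \supset F_{\varepsilon}$. Let
\[ N = \max\{\pi^{-1}(k) : k \in F_{\varepsilon}\}. \]
Then $\bigl|S - \sum_{k=0}^{n} a_{\pi(k)}\bigr| < \varepsilon$ for all $n \ge N$.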


(c) $\Rightarrow$ (a) Notice first that a series $\sum_{n\ge 0} z_n$ is absolutely convergent if and only if for any finite subset $F \subset \mathbb{N}$, the series $\sum_{n\notin F} z_n$ is absolutely convergent too.

Suppose now that $\sum_{n\ge 0} a_n$ is an unconditionally convergent series. If this series is not absolutely convergent, then Lemma 4.4.3 and the previous remark imply the existence of a sequence of pairwise disjoint finite subsets $F_n \subset \mathbb{N}$ such that
\[ \Bigl|\sum_{k\in F_n} a_k\Bigr| \ge 1 \quad\text{for all } n. \tag{4.1} \]

Let $F_n = \{k_{n,0}, \ldots, k_{n,p(n)}\}$ and $\mathbb{N}\setminus\bigcup_n F_n = \{r_0, r_1, r_2, \ldots\}$. By the hypothesis, the series
\[ a_{k_{0,0}} + \cdots + a_{k_{0,p(0)}} + a_{r_0} + a_{k_{1,0}} + \cdots + a_{k_{1,p(1)}} + a_{r_1} + \cdots \]
must be convergent, while (4.1) shows it fails the Cauchy Criterion of convergence. Consequently, the series $\sum_{n\ge 0} a_n$ is absolutely convergent.


The series that are convergent but not absolutely convergent are called conditionally convergent. This is the case for the alternating harmonic series. An important result concerning this type of convergence is stated as Exercise 3.


4.4.5 Remark A sequence $(z_n)_n$ of complex numbers is called absolutely summable if the associated series $\sum_{n\ge 0} z_n$ is absolutely convergent. We will denote by $\ell^1(\mathbb{N})$ the complex linear space of all such sequences (and by $\ell^1_{\mathbb{R}}(\mathbb{N})$ its real companion). The natural norm on these spaces is
\[ \|(z_n)_n\|_1 = \sum_{n=0}^{\infty} |z_n|. \]

With respect to this norm, both spaces $\ell^1(\mathbb{N})$ and $\ell^1_{\mathbb{R}}(\mathbb{N})$ are Banach spaces. See Exercise 4. Moreover, as in the case of $\ell^{\infty}_{\mathbb{R}}(\mathbb{N})$, the space $\ell^1_{\mathbb{R}}(\mathbb{N})$ is a linear lattice of functions with respect to the ordering by components,
\[ (x_n)_n \le (y_n)_n \quad\text{if and only if}\quad x_n \le y_n \text{ for all } n. \]

Courses on integration theory present the absolutely summable sequences $(z_n)_n$ as integrable functions defined on $\mathbb{N}$, and their sums $\sum_{n\ge 0} z_n$ as integrals.

Exercises


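1. One can prove that $\sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}$. See Theorem 8.7.2. Infer that
\[ \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n^2} = \frac{\pi^2}{12} \quad\text{and}\quad \sum_{n=0}^{\infty} \frac{1}{(2n+1)^2} = \frac{\pi^2}{8}. \]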


2. Let $\sum_{n\ge 0} a_n$ be a convergent series of real numbers. Denote the positive terms (in order) as $b_0, b_1, b_2, \ldots$ and the other terms as $c_0, c_1, c_2, \ldots$. Prove the following statements:

(a) If the given series is absolutely convergent, then so are both $\sum_{n\ge 0} b_n$ and $\sum_{n\ge 0} c_n$. Moreover, $\sum_{n=0}^{\infty} a_n = \sum_{n=0}^{\infty} b_n + \sum_{n=0}^{\infty} c_n$.

(b) If $\sum_{n\ge 0} a_n$ is conditionally convergent, then $\sum_{n=0}^{\infty} b_n = \infty$ and $\sum_{n=0}^{\infty} c_n = -\infty$.

3. (The Rearrangement Theorem of B. Riemann). Infer from the previous exercise the following result: If $\sum_{n\ge 0} a_n$ is a conditionally convergent series of real numbers and $S \in \mathbb{R}$, then there is a permutation $\pi$ of $\mathbb{N}$ such that $\sum_{n=0}^{\infty} a_{\pi(n)} = S$.

4. Consider the linear space $\ell^1(\mathbb{N})$, of all absolutely summable sequences of complex numbers (endowed with the coordinatewise algebraic operations). Prove that:

(a) $\ell^1(\mathbb{N})$ is complete with respect to its natural norm
\[ \|(z_n)_n\|_1 = \sum_{n=0}^{\infty} |z_n|; \]

(b) $\ell^1(\mathbb{N})$ is not complete with respect to the supremum norm
\[ \|(z_n)_n\|_{\infty} = \sup_{n\in\mathbb{N}} |z_n|; \]

(c) $\ell^1_{\mathbb{R}}(\mathbb{N})$ is a linear lattice of functions with respect to the ordering $(a_n)_n \le (b_n)_n$ if and only if $a_n \le b_n$ for all $n$.

5. Prove that the Bolzano-Weierstrass Theorem fails in $\ell^1(\mathbb{N})$.


4.5 Notes and Remarks 


The summation of certain geometric series has been known since the times of Archimedes. The divergence of the harmonic series was established by Nicole Oresme (c. 1360). He also showed that

The summation problem of different series has received a great deal of attention from many mathematicians starting with the 18th century. However, the rigorous study of numerical series begins with Carl Friedrich Gauss (1812), Bernhard Bolzano (1817) and Augustin-Louis Cauchy (1821).

In the following chapters, we will present some other convergence tests for nu- 
merical series, and also some methods (based on calculus) for a higher precision 
computation of the sum of a convergent series. 

Recently, Elijah Liflyand, Sergey Tikhonov and Maria Zeltser [3] have noticed 
that the monotonicity assumption for the sequence of terms in some convergence 
tests can be relaxed by using Leindler’s concept of weakly decreasing sequence. 
For example, this applies to Cauchy Condensation Test and Olivier’s Test, but not 
to Leibniz Test. Recall that a sequence $(a_n)_n$ of nonnegative numbers is weakly decreasing if for some positive constant $C$, it satisfies the condition
\[ a_k \le Ca_n \quad\text{for every } k \in [n, 2n] \text{ and every } n \in \mathbb{N}. \]

Surprisingly, the nature of some positive series like $\sum_{n\ge 1} \frac{1}{n}\bigl(\frac{2}{3} + \frac{1}{3}\sin n\bigr)^n$ is still unknown. After $10^7$ terms, the series equals approximately 2.163... . See the book of Jonathan Borwein, David Bailey and Roland Girgensohn [4] for more details.

The combinatorial properties of divergent series are yet not well understood. As follows from Corollary 4.2.2, this problem has an arithmetic character.

4.5.1 Open Problem (Paul Erdős) Suppose that $(a_n)_n$ is a strictly increasing sequence of positive integers such that $\sum_{n=0}^{\infty} \frac{1}{a_n} = \infty$. Does $(a_n)_n$ contain arbitrarily long finite arithmetic progressions?

This problem was solved in the affirmative in two important particular cases. The first case (settled by Endre Szemerédi [5]) refers to the sequences $(a_n)_n$ having positive upper density in the sense that
\[ \limsup_{n\to\infty} \frac{|\{k : a_k \in [1, n]\}|}{n} > 0. \]
One can easily prove that $\sum_{n=0}^{\infty} \frac{1}{a_n} = \infty$ for any such sequence. See [1].

Leonhard Euler proved in 1737 that the series $\sum_{p\ \mathrm{prime}} \frac{1}{p}$ of reciprocals of prime numbers is divergent. A nice proof (due to Erdős) is given in [6]. According to an old result due to Pafnuty Lvovich Chebyshev, if $\pi(n) = \#\{p \le n : p \text{ prime}\}$, then
\[ \frac{7}{8} < \frac{\pi(n)}{n/\ln n} < \frac{9}{8} \]
for all sufficiently large $n$, and thus the set of prime numbers has zero upper density, that is,
\[ \lim_{n\to\infty} \frac{\pi(n)}{n} = 0. \]

Ben Joseph Green and Terence Tao [7] proved the existence of arbitrarily long finite arithmetic progressions of prime numbers.

How slowly can a series converge or diverge? The following two striking examples were first noticed by Godfrey Harold Hardy [8]:
\[ \text{(a)}\ \sum_{n\ge 3} \frac{1}{n\ln n\,(\ln\ln n)^2} \qquad\qquad \text{(b)}\ \sum_{n\ge 3} \frac{1}{n\ln n\,\ln\ln n}. \]

The series (a) converges to 38.43... but so slowly that we need to sum up its first $10^{3.14\cdot 10^{86}}$ terms to get the sum with two exact decimals. The series (b) is divergent. However, its partial sums exceed 10 only after $10^{10^{100}}$ terms. The argument needs the Euler-Maclaurin formula. See Sect. 10.2.
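Related to series are the infinite products. If $(a_n)_n$ is a numerical sequence, we put
\[ a = \prod_{k=0}^{\infty} a_k \quad\text{if}\quad \lim_{n\to\infty} \prod_{k=0}^{n} a_k = a. \]

An infinite product of the form $\prod_{k=0}^{\infty} (1 + x_k)$ with all $x_k \ge 0$ is convergent if and only if the series $\sum_{k\ge 0} x_k$ is convergent. See the inequalities
\[ 1 + \sum_{k=0}^{n} x_k \le \prod_{k=0}^{n} (1 + x_k) \le \exp\Bigl(\sum_{k=0}^{n} x_k\Bigr). \]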


A celebrated example is Wallis' product,
\[ \frac{\pi}{2} = \prod_{n=1}^{\infty} \frac{2n}{2n-1}\cdot\frac{2n}{2n+1}. \]

See Exercise 8 at the end of Sect. 9.2 for a short proof based on integral calculus. Wallis' product is a particular case of Euler's product formula for the sine function:
\[ \sin x = x\prod_{n=1}^{\infty} \Bigl(1 - \frac{x^2}{n^2\pi^2}\Bigr) \]
for all real numbers $x$. This is the object of Theorem 8.7.1.

More details on the infinite products can be found in the book of Konrad Knopp 
[9]. 
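The inequalities relating finite products to finite sums, as well as the slow convergence of Wallis' product, can be explored numerically. The following sketch relies on assumptions of our own: the sample $x_k = 1/(k+1)^2$, the number of factors, and the helper names are all hypothetical choices.

```python
from math import exp, pi, prod

def product_bounds(x):
    """Return (1 + sum x_k, prod (1 + x_k), exp(sum x_k)) for a finite list of x_k >= 0."""
    s = sum(x)
    return 1 + s, prod(1 + t for t in x), exp(s)

def wallis_partial_product(n_factors):
    """Product of (2n)^2 / ((2n - 1)(2n + 1)) for n = 1, ..., n_factors."""
    return prod((2 * n) ** 2 / ((2 * n - 1) * (2 * n + 1))
                for n in range(1, n_factors + 1))

# The middle value lies between the two outer ones.
print(product_bounds([1 / (k + 1) ** 2 for k in range(50)]))
# The partial products increase slowly towards pi / 2.
print(wallis_partial_product(100_000), pi / 2)
```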

As was noticed by Ernst Steinitz, the rearrangement theorem of Bernhard Riemann (see Exercise 3 at the end of Sect. 4.4) has a companion for series of complex numbers: If $\sum_{n\ge 0} z_n$ is a conditionally convergent series of complex numbers, then the sums of the convergent series $\sum_{n=0}^{\infty} z_{\pi(n)}$ (where $\pi$ is a permutation of $\mathbb{N}$) constitute a subset $X$ of $\mathbb{C}$ such that
\[ u, v \in X \text{ and } \lambda \in \mathbb{R} \text{ imply } (1 - \lambda)u + \lambda v \in X. \]

Chapter 5
Metric and Topology


Many problems in mathematical analysis lead to certain questions relative to the nature of suitable sets. For example, this is the case of the subject of continuity, that will be detailed in the next chapter. Our goals here are restricted to a succinct presentation of the notions of distance and vicinity, as well as of some basic results concerning the relative position of the different subsets of a Euclidean space (most of the time $\mathbb{R}$ or $\mathbb{C}$).


5.1 Metric Spaces 


A careful inspection of the theory developed in the last two chapters easily reveals that the notions of bounded set, convergent sequence and Cauchy sequence can be perfectly described by using the Euclidean metric between points in $\mathbb{R}^p$ rather than the Euclidean norm. This metric is defined by the formula
\[ d : \mathbb{R}^p \times \mathbb{R}^p \to \mathbb{R}, \quad d(x, y) = \|x - y\| = \Bigl(\sum_{k=1}^{p} (x_k - y_k)^2\Bigr)^{1/2}, \]
and verifies the following four basic properties:

(M1) $d(x, y) \ge 0$
(M2) $d(x, y) = 0$ if and only if $x = y$
(M3) $d(x, y) = d(y, x)$
(M4) $d(x, y) \le d(x, z) + d(z, y)$

whenever $x, y, z \in \mathbb{R}^p$. The inequality (M4) is known as the triangle inequality.

While the Euclidean structure presupposes the presence of a linear structure, the properties (M1)-(M4) make sense even when we restrict ourselves to any nonempty subset $A$ of $\mathbb{R}^p$, endowed with the induced metric $d|_{A\times A}$.

5.1.1 Definition Given a set $M$, we call a metric on $M$ every function
\[ d : M \times M \to \mathbb{R} \]
that verifies the properties (M1)-(M4) above.


A metric space should be understood as a pair (M, d), consisting of a set M and 
a metric d on it. The elements of a metric space are called points. When there is no 
ambiguity, a metric space (M, d) will be denoted simply M. 

Since the role of a metric is to measure the distances between the different pairs 
of points, the metrics are also called distances. 

In any metric space, one can consider some special subsets called balls. 

The open ball of center $a$ and radius $r > 0$ is the set $B_r(a)$ given by

$$B_r(a) = \{x \in M : d(x, a) < r\},$$

and the closed ball of center $a$ and radius $r > 0$ is the set $\overline{B}_r(a)$ given by

$$\overline{B}_r(a) = \{x \in M : d(x, a) \le r\}.$$

In the case of the Euclidean space $\mathbb{R}$, the open ball $B_r(a) = \{x : |x - a| < r\}$
coincides with the open interval $I_r(a) = (a - r, a + r)$ of center $a$ and radius
$r > 0$. Similarly, the closed ball $\overline{B}_r(a) = \{x : |x - a| \le r\}$ coincides with the
closed interval $\overline{I}_r(a) = [a - r, a + r]$ of center $a$ and radius $r > 0$.

It is worth noticing that every nonempty interval $(a, b)$ can be described as an
interval $I_r(x)$, and every interval $[a, b]$ can be described as an interval $\overline{I}_r(x)$:

$$(a, b) = I_{(b-a)/2}\Big(\frac{a+b}{2}\Big) \quad \text{and} \quad [a, b] = \overline{I}_{(b-a)/2}\Big(\frac{a+b}{2}\Big).$$

Moreover,

$$I_r(a) \subset I_s(a) \quad \text{for all } 0 < r < s \text{ and every } a \in \mathbb{R}.$$

In the case of the Euclidean spaces $\mathbb{R}^2$ and $\mathbb{C}$, the balls are called discs and they are
denoted by symbols like $D_r(a)$ and $\overline{D}_r(a)$.


5.1.2 Definition A subset of a metric space is called bounded if it can be included in
a suitable ball.


According to this definition, all balls (open or closed) are bounded sets. 

If a subset $A$ (of the metric space $M$) is contained in a ball $B_r(a)$, then for any
other point $a'$, there is a strictly positive number $r'$ such that $A$ is included in $B_{r'}(a')$.
Indeed, if $x \in A$, then $x \in B_r(a)$ and

$$d(x, a') \le d(x, a) + d(a, a') < r + d(a, a'),$$

so that $x$ belongs to the open ball $B_{r'}(a')$ with $r' = r + d(a, a')$.

By definition, the diameter of a bounded set $A$ is the number

$$\operatorname{diam}(A) = \sup\{d(x, y) : x, y \in A\}.$$

Clearly, when $A$ is bounded, it is contained in any closed ball $\overline{B}_{\operatorname{diam}(A)}(x)$ centered
at a point $x$ of $A$.
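
The two remarks above (changing the center of an enclosing ball and enclosing a set in a ball of radius equal to its diameter) can be checked on a small example. This is a minimal sketch for a finite subset of the Euclidean plane; the helper names `diameter` and `recentered_radius` and the sample set are assumptions made for the illustration.

```python
import itertools
import math


def diameter(points):
    """diam(A) = sup{d(x, y) : x, y in A}; for a finite set the sup is a max."""
    return max(math.dist(x, y) for x, y in itertools.product(points, repeat=2))


def recentered_radius(a, r, a_new):
    """Radius r' = r + d(a, a') for which B_r(a) is contained in B_{r'}(a')."""
    return r + math.dist(a, a_new)


A = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0), (1.0, 1.0)]
diam = diameter(A)  # the farthest pair is (3, 0) and (0, 4)

# A lies in every closed ball of radius diam(A) centered at one of its points.
in_closed_balls = all(math.dist(x, c) <= diam for c in A for x in A)

# A lies in the open ball B_6((0, 0)); move the center to a' = (3, 4).
a, r, a_new = (0.0, 0.0), 6.0, (3.0, 4.0)
r_new = recentered_radius(a, r, a_new)
in_recentered_ball = all(math.dist(x, a_new) < r_new for x in A)
```

The new radius is generally far from optimal; the argument only needs some finite radius to exist.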

Other important examples of bounded sets in a metric space $M$ are the spheres.
The sphere of radius $r > 0$, centered at $a$, is the set

$$S_r(a) = \{x : d(x, a) = r\}.$$

When $M = \mathbb{R}^2$ or $\mathbb{C}$, the spheres are called circles and denoted by $C_r(a)$.

As we already noticed, every subset $A$ of a metric space $M = (M, d)$ can be
endowed naturally with a structure of metric space, by considering on it the induced
metric, $d|_{A \times A}$. In this case, we say that the metric space $A$ is a subspace of the space
$M$. The balls in $A$ are precisely the intersections with $A$ of the balls in "the big space"
$M$. As a consequence, a subset of $A$ is bounded in $A$ if and only if it is bounded in $M$.

We can compare different metric spaces by using isometries. An isometry
from a metric space $M = (M, d)$ to a metric space $N = (N, \rho)$ is any function
$f : M \to N$ such that

$$\rho(f(x), f(y)) = d(x, y) \quad \text{for all } x, y \in M.$$

Necessarily, every isometry is a one-to-one map: if $f(x) = f(y)$, then $d(x, y) = 0$,
hence $x = y$.

We say that the two spaces $M$ and $N$ are isometric when there is an onto isometry
$f : M \to N$. Two isometric metric spaces have the same metric properties and it is
customary to identify them. As a consequence, the existence of an isometry $f$ from
$M$ to $N$ allows us to identify $M$ with $f(M)$ and see $M$ as a subspace of $N$. We say
in this case that $M$ embeds into $N$ and $f$ is an embedding of $M$ into $N$. It is worth
noticing that every metric space can be viewed as a subspace of a suitable Banach
space. See Remark 6.9.2.


Exercises 


1. Notice that every set $M$ can be seen as a metric space by considering on it the
so-called discrete metric, defined by the formula

$$d(x, y) = \begin{cases} 0 & \text{if } x = y \\ 1 & \text{if } x \ne y. \end{cases}$$

What are the open balls in this case? What about the closed ones?

2. Let $x, y, x', y'$ be four points in a metric space. Prove the quadrilateral inequality,

$$|d(x, y) - d(x', y')| \le d(x, x') + d(y, y').$$

In $\mathbb{R}^2$, this inequality may be interpreted as stating that the sum of the lengths of
two opposite sides in a quadrilateral is greater than or equal to the absolute value
of the difference in length of the other two sides.

3. Suppose that $E$ is a normed linear space. Prove that the translation of vector $x_0$,

$$T_{x_0} : E \to E, \qquad T_{x_0}(x) = x + x_0,$$

is an isometry relative to the metric associated to the norm, $d(x, y) = \|x - y\|$.

4. Prove that all isometries $f : \mathbb{R} \to \mathbb{R}$ are of the form

$$f(x) = \varepsilon x + a,$$

for suitable $\varepsilon \in \{-1, 1\}$ and $a \in \mathbb{R}$.

5. Prove that the function

$$\rho(x, y) = \frac{|x - y|}{1 + |x - y|}$$

defines a metric on $\mathbb{R}$, which is not associated to any norm on $\mathbb{R}$.

6. Given a prime number $p$, every nonzero rational number $x$ can be represented as
$x = p^{\alpha} \frac{r}{s}$, where $r$ and $s$ are integers not divisible by $p$, and $\alpha \in \mathbb{Z}$. This allows
us to define the $p$-adic norm of a rational number by $|x|_p = p^{-\alpha}$ if $x \ne 0$ and
$|0|_p = 0$. Verify that the $p$-adic metric $d_p(x, y) = |x - y|_p$ is indeed a metric
on $\mathbb{Q}$.


5.2 The Topology of a Metric Space 


In every metric space, the open balls allow us to introduce several concepts intended
to describe the position of its different subsets relative to each other (regardless of their
form, size, or nature): neighborhood, interior, closure, boundary, etc.

Although we are primarily interested in the case of the Euclidean metric space 
R, we will keep the entire discussion in the setting of an arbitrary metric space 
M = (M,d). 

The central concept is that of open set. 


5.2.1 Definition We say that a subset $U$ of $M$ is open if for every $a \in U$ there is a
number $r > 0$ such that the ball $B_r(a)$ is contained in $U$.

The family $\mathcal{O}$ of all open subsets of $M$ is called the topology of $M$ and the pair
$(M, \mathcal{O})$ is called the topological space associated to the metric space $M$; in the case
of $\mathbb{R}^p$ we speak about the Euclidean topological space $\mathbb{R}^p$. The topology associated
to a discrete metric is called the discrete topology.


The general properties of $\mathcal{O}$ are the following:


(TOP1) $\emptyset$ and $M$ belong to $\mathcal{O}$.
(TOP2) Any union of elements of $\mathcal{O}$ is an element of $\mathcal{O}$.
(TOP3) Any intersection of finitely many elements of $\mathcal{O}$ is an element of $\mathcal{O}$.


The proof is left as Exercise 1.


Fig. 5.1 Open intervals are open sets


5.2.2 Lemma An open ball is an open set. 


Proof Let $x \in B_r(a)$. We will show that $B_\varepsilon(x) \subset B_r(a)$ for every $\varepsilon$ in the interval
$(0, r - d(x, a))$. In fact, if $y \in B_\varepsilon(x)$, then

$$d(y, a) \le d(y, x) + d(x, a) < \varepsilon + d(x, a) < r,$$

from which we conclude $y \in B_r(a)$.


An immediate consequence of Lemma 5.2.2 is that all open intervals in $\mathbb{R}$ are
open sets. See Fig. 5.1.

Related to (TOP3), it is important to notice that an infinite intersection of open
sets may not be an open set. See the case of $\mathbb{R}$ and of the sequence of open intervals
$(-1/n, 1/n)$, whose intersection is the one-point set $\{0\}$.
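
To see concretely why the intersection of the intervals $(-1/n, 1/n)$ reduces to $\{0\}$, one can compute, for each $x \ne 0$, the first index $n$ at which $x$ drops out. The sketch below uses exact rational arithmetic; the function name `first_excluding_index` is an assumption made for the illustration.

```python
import math
from fractions import Fraction


def first_excluding_index(x):
    """Smallest n >= 1 with x outside (-1/n, 1/n); None when x == 0,
    because 0 belongs to every interval (-1/n, 1/n)."""
    if x == 0:
        return None
    # x is outside (-1/n, 1/n) exactly when |x| >= 1/n, that is, n >= 1/|x|.
    return max(1, math.ceil(1 / abs(x)))


for x in (Fraction(1, 10), Fraction(-3, 10), Fraction(2), Fraction(0)):
    print(x, first_excluding_index(x))
```

Every nonzero point is eventually excluded, so no interval $(-r, r)$ with $r > 0$ fits inside the intersection, which is therefore not open.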


The open sets of R can be completely characterized in terms of intervals: 


5.2.3 Theorem Every nonempty open subset of R can be represented (in a unique 
way) as a countable union of mutually disjoint nonempty open intervals. 


Proof Let $U$ be the open set under consideration. For each $x$ of $U$, we denote by $I_x$
the union of all open intervals that contain $x$ and are included in $U$. According to
Proposition 2.5.1, $I_x$ is an interval, necessarily open, as being a union of open sets.
Clearly, $I_x$ is the largest interval containing $x$ and included in $U$. As a consequence,
if $x \ne y$, then either $I_x = I_y$ or $I_x \cap I_y = \emptyset$.

Notice that $U = \bigcup_{x \in U} I_x$ and that the set of distinct intervals $I_x$ that appear in
this representation is countable. In fact, according to Lemma 1.4.4 and the Axiom
of Choice, we can choose in each interval $I_x$ a rational point $r_x$, and this yields an
injective function $I_x \mapsto r_x$ from the set of distinct component intervals of $U$ to $\mathbb{Q}$.

The uniqueness part is straightforward. 
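
When $U$ is given as a finite union of bounded open intervals, its component intervals can be computed by sorting and merging. The following Python sketch illustrates this special case only; the helper name `components` is hypothetical, and intervals are assumed to be pairs of finite numbers.

```python
def components(intervals):
    """Return the disjoint open intervals whose union equals the union of
    the given open intervals (a, b); empty intervals (a >= b) are ignored."""
    merged = []
    for a, b in sorted(i for i in intervals if i[0] < i[1]):
        if merged and a < merged[-1][1]:
            # (a, b) overlaps the last component, so extend that component.
            merged[-1] = (merged[-1][0], max(merged[-1][1], b))
        else:
            # Strict comparison above: (0, 1) and (1, 2) stay separate,
            # because the point 1 belongs to neither interval.
            merged.append((a, b))
    return merged


U = [(0, 1), (1, 2), (0.5, 0.8), (3, 5), (4, 6)]
print(components(U))
```

For this input the components are $(0, 1)$, $(1, 2)$ and $(3, 6)$.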


The simple structure of open sets in R has no analogue in Euclidean spaces of 
dimension p > 1. See Exercise 11, at the end of this section. 

The notions that can be defined in terms of a topology are called topological
notions.

Fig. 5.2 The separation property


In what follows, we will list several important topological notions that can be 
ascribed to the topology of any metric space M. 

A neighborhood of a point $a \in M$ is a set $V \subset M$ that includes a ball $B_r(a)$
around $a$. We denote by $\mathcal{V}_a$ the family of all neighborhoods of $a$.

The basic properties of the families $\mathcal{V}_a$ are as follows:


(V1) $\mathcal{V}_a \ne \emptyset$ (for example, $M \in \mathcal{V}_a$) and every $V \in \mathcal{V}_a$ contains the point $a$.

(V2) If $V \in \mathcal{V}_a$ and $W \supset V$, then $W \in \mathcal{V}_a$.

(V3) The intersection of a finite number of neighborhoods of $a$ is a neighborhood
of $a$.

(V4) Each neighborhood $V \in \mathcal{V}_a$ includes a neighborhood $W \in \mathcal{V}_a$ that is a
neighborhood of each of its points.


The proof is left as Exercise 2. A set that is a neighborhood of each of its points
is precisely an open set.

The topology associated to a metric space also satisfies Hausdorff's separation
property: For every pair of distinct points $x, y \in M$, there are neighborhoods $V_x \in \mathcal{V}_x$
and $V_y \in \mathcal{V}_y$ such that

$$V_x \cap V_y = \emptyset.$$

See Fig. 5.2. For example, we can choose $V_x = B_r(x)$ and $V_y = B_r(y)$, where $r$ is
any positive number less than $d(x, y)/2$.

Related to the notion of neighborhood is that of interior point. 

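Let $A$ be a subset of $M$. A point $a$ of $A$ is said to be an interior point of $A$ if $A$
is a neighborhood of $a$. The set of all interior points of $A$ is called the interior of $A$
(and is denoted $\operatorname{int} A$ or $\mathring{A}$). Since

$$\operatorname{int} A = \bigcup_{U \subset A,\ U \text{ open}} U,$$

it follows that $\operatorname{int} A$ is an open set and

$$\operatorname{int} A \subset A.$$

Moreover, $\operatorname{int} A$ is the largest open subset of $A$ and the equality $A = \operatorname{int} A$ occurs if
and only if $A$ is an open set.

A subset $A$ of $M$ is called closed if its complement $\complement A = M \setminus A$ is open.
Recall De Morgan's rules,

$$\complement \Big( \bigcup_{i \in I} A_i \Big) = \bigcap_{i \in I} \complement A_i \quad \text{and} \quad \complement \Big( \bigcap_{i \in I} A_i \Big) = \bigcup_{i \in I} \complement A_i,$$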


which are true for all families of sets. From here, we can easily obtain the basic 
properties of closed sets: 


(C1) $M$ and $\emptyset$ are closed sets.
(C2) The intersection of any family of closed sets is a closed set.
(C3) The union of a finite family of closed sets is a closed set.


Notice that in general, the sets $M$ and $\emptyset$ are simultaneously open and closed. In
the case of $\mathbb{R}$ (endowed with the Euclidean metric), these are the only such subsets.
See Exercise 8(b).


5.2.4 Example (a) All finite subsets of $\mathbb{R}$ are closed. In fact, according to the property
(C3), it suffices to consider the case of one-point sets. Now, for each $a \in \mathbb{R}$, the set
$\mathbb{R} \setminus \{a\}$ is the union of the two open intervals $(-\infty, a)$ and $(a, \infty)$. More generally, in every metric space, the
finite subsets are closed.

(b) All closed intervals are closed sets (since their complements are unions of 
open intervals). In general, in every metric space, the closed balls are closed. 


We will next describe two new topological notions that parallel the relationship
between open sets, interior of a set, and interior points.
The closure of a subset $A$ of $M$ is the set

$$\overline{A} = \bigcap_{F \supset A,\ F \text{ closed}} F.$$

Clearly, $\overline{A}$ is closed (as an arbitrary intersection of closed subsets) and

$$A \subset \overline{A}.$$

It is easy to see that $\overline{A}$ is the smallest closed set containing $A$ and that the equality
$A = \overline{A}$ occurs if and only if $A$ is closed.
We say that a point $a \in M$ is a closure point of $A$ if

$$V \cap A \ne \emptyset$$

for every neighborhood $V$ of $a$. By property (V4) above, the content of this definition
is the same if we restrict to the open neighborhoods of $a$.

5.2.5 Lemma If $A$ is a subset of a metric space $M$, then

$$\overline{A} = \{a \in M : a \text{ is a closure point of } A\}.$$

Proof Let $B$ be the set of all closure points of $A$. If $a \in \overline{A}$, but $a$ is not in $B$, then
there must exist an open neighborhood $V$ of $a$ such that $V \cap A = \emptyset$. This implies
that $A$ is contained in $\complement V$, which is a closed set. Since $a \in \overline{A}$, this point must be in
$\complement V$, a fact that contradicts the choice of $V$ as a neighborhood of $a$. Hence $\overline{A} \subset B$.
Conversely, let $a$ be a closure point of $A$ and let $F$ be a closed set that includes
$A$ (equivalently, $\complement F \cap A = \emptyset$). We will show that $a \in F$. In fact, if the contrary is
true, $\complement F$ would be an open set that contains $a$ (and thus a neighborhood of $a$). Since
$a$ is a closure point, we must have $\complement F \cap A \ne \emptyset$, which contradicts the choice of $F$.
Therefore $\overline{A} \supset B$ and thus $\overline{A} = B$.


De Morgan’s rules have a nice topological companion, relating the interior and 
the closure of a set. See Exercise 4. 

A word of caution concerning the closure of sets in arbitrary metric spaces: in
Euclidean spaces, it is true that the closure of any open ball $B_r(a)$ is the closed ball
$\overline{B}_r(a)$ (and the interior of the closed ball $\overline{B}_r(a)$ is the open ball $B_r(a)$). However,
this is not true for all metric spaces. See the case of balls $B_1(a)$ in a space endowed
with the discrete metric.

The boundary of a set $A$, denoted $\partial A$, is defined by

$$\partial A = \overline{A} \setminus \operatorname{int} A.$$

The boundary of any bounded interval of endpoints $a < b$ is the set $\{a, b\}$. The
boundary of the closed unit disc $D = \overline{D}_1(0)$ in $\mathbb{C}$ is the unit circle

$$S^1 = \{z : z \in \mathbb{C},\ |z| = 1\}.$$

In general, the boundary of any set is a closed set.


5.2.6 Remark (Equivalent metrics) It is possible that two different metrics $d_1$ and $d_2$
on a set $M$ define the same topology, that is, any open set with respect to $d_1$ is open
with respect to $d_2$, and vice versa. Such metrics are called equivalent. For example,
each of the metrics

$$d(m, n) = |m - n| \quad \text{and} \quad \rho(m, n) = \frac{|m - n|}{1 + |m - n|}$$

defines on $\mathbb{Z}$ the discrete topology and thus they are equivalent.

The equivalence of metrics can be formulated in terms of balls by denoting by
$B_r^{(d_1)}(x)$ and $B_s^{(d_2)}(x)$, respectively, the balls around $x$ associated to the metrics $d_1$
and $d_2$. Precisely, $d_1$ and $d_2$ are equivalent if for every point $x$, each ball $B_r^{(d_1)}(x)$
contains a ball $B_s^{(d_2)}(x)$ and vice versa. Clearly, this statement remains valid if we
consider only balls of radii $r, s \in (0, 1)$. Using this fact, one can easily prove the
equivalence of the metrics $d(x, y) = |x - y|$ and $\rho(x, y) = \frac{|x - y|}{1 + |x - y|}$ on $\mathbb{R}$.
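
For the last pair of metrics the comparison of balls can be made explicit: since $t \mapsto t/(1+t)$ is strictly increasing on $[0, \infty)$, the $d$-ball of radius $r$ coincides with the $\rho$-ball of radius $r/(1+r)$, and the $\rho$-ball of radius $s \in (0, 1)$ coincides with the $d$-ball of radius $s/(1-s)$. The sketch below checks this on a grid of rational points with exact arithmetic; the helper names `rho_radius` and `d_radius` are assumptions made for the illustration.

```python
from fractions import Fraction


def d(x, y):
    return abs(x - y)


def rho(x, y):
    return abs(x - y) / (1 + abs(x - y))


def rho_radius(r):
    """Radius s (always < 1) such that the d-ball of radius r is the rho-ball of radius s."""
    return r / (1 + r)


def d_radius(s):
    """Radius r such that the rho-ball of radius s is the d-ball of radius r, for 0 < s < 1."""
    return s / (1 - s)


x = Fraction(0)
r = Fraction(1, 2)
s = rho_radius(r)
grid = [Fraction(k, 8) for k in range(-40, 41)]

# A grid point lies in the d-ball of radius r exactly when it lies in the rho-ball of radius s.
same_ball = all((d(x, y) < r) == (rho(x, y) < s) for y in grid)
```

A $\rho$-ball of radius $s \ge 1$ is the whole line, which is why only radii in $(0, 1)$ matter.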
Exercises


1. Prove the properties (TOP1)-(TOP3) of the topology associated to a metric.

2. Prove the properties (V1)-(V4) verified by the family of neighborhoods of a
given point.

3. Show that in every metric space, the finite subsets are closed and that the same
is true for the closed balls or for the spheres.
[Hint: Use Hausdorff's separation property to show that their complements are
open.]

4. (The duality between interior and closure). Prove that

$$\complement(\operatorname{int} A) = \overline{\complement A} \quad \text{and} \quad \complement(\overline{A}) = \operatorname{int}(\complement A)$$

for every set $A$ in a metric space $M$.

5. Prove that the closure and the interior of every bounded subset of a metric space
are also bounded.

6. Describe the relative topology of $T = (0, 1]$ (the topology of $T$ viewed as a
subspace of $\mathbb{R}$).
What is the closure of $(0, 1/2)$ in $T$?
What is the interior of $[1/2, 1]$?

7. Consider the metric space $\mathbb{Q}$ (viewed as a subspace of $\mathbb{R}$).

(a) What is the interior of $[0, 1] \cap \mathbb{Q}$ in $\mathbb{Q}$? What about in $\mathbb{R}$?
(b) What is the closure of $[0, 1] \cap \mathbb{Q}$ in $\mathbb{Q}$? What about in $\mathbb{R}$?

8. Prove that:

(a) $\mathbb{R}$ is the union of a countable set of mutually disjoint nonempty open intervals
only when the family reduces to one interval.

(b) the only subsets of $\mathbb{R}$ simultaneously closed and open are $\emptyset$ and $\mathbb{R}$.

[Hint: Suppose that $A$ is a subset of $\mathbb{R}$ such that both $A$ and $\complement A$ are nonempty
and open (hence closed). Choose $x$ in $A$ and $y$ in $\complement A$. In case we have $x < y$,
let $z = \sup\{a : a \in A,\ a < y\}$. Show that either of the hypotheses $z$ in $A$ or $z$
in $\complement A$ leads to a contradiction.]

9. Compute the interior and the closure in $\mathbb{R}$ of the subset $\{1/n : n \in \mathbb{Z},\ n \ne 0\}$.

10. Prove that $\overline{A \cup B} = \overline{A} \cup \overline{B}$ and $\operatorname{int}(A \cap B) = \operatorname{int} A \cap \operatorname{int} B$, and provide examples
illustrating that $\overline{A \cap B} \ne \overline{A} \cap \overline{B}$ and $\operatorname{int}(A \cup B) \ne \operatorname{int} A \cup \operatorname{int} B$.

11. Prove that the open unit square $(0, 1) \times (0, 1)$ is not a countable union of pairwise
disjoint open discs.

[Hint: Look at the diagonal $\{(x, x) : 0 < x < 1\}$ and take into account the
uniqueness part of Theorem 5.2.3.]

12. Prove that every closed subset of $\mathbb{R}$ is a countable intersection of open sets.

13. One can prove that all norms on $\mathbb{R}^p$ induce the same topology. See [1], p. 16.
Use Lemma 3.2.1 (with $x$ replaced by $x - x_0$) to show that $\|\cdot\|_1$, $\|\cdot\|_\infty$, and the
Euclidean norm on $\mathbb{R}^p$ generate the same topology.


5.3 Convergence in Metric Spaces 


The notion of convergent sequence in a Euclidean space can be easily extended to the
context of metric spaces. Precisely, given a metric space $M = (M, d)$, a sequence
$(x_n)_n$ of points of $M$ is said to be convergent to the point $x$ of $M$ (abbreviated,
$x_n \to x$) if for every $\varepsilon > 0$, there is a natural number $N$ such that for all $n > N$,

$$d(x_n, x) < \varepsilon. \tag{5.1}$$

The element $x$ (when it exists) is unique and is called the limit of the sequence
$(x_n)_n$. The limit is often denoted as

$$\lim_{n \to \infty} x_n.$$

Noticing that the inequality (5.1) can be expressed as

$$x_n \in B_\varepsilon(x),$$

we arrive at the following reformulation of the notion of convergence:


5.3.1 Theorem (Topological Characterization of Convergence) A sequence $(x_n)_n$
is convergent to a point $x$ if and only if for every neighborhood $V$ of $x$, there is an
index $N \in \mathbb{N}$ such that for all $n > N$, we have $x_n \in V$.


Proof Necessity. Suppose that $x_n \to x$ and let $V \in \mathcal{V}_x$. By the definition of
neighborhood, there is $\varepsilon > 0$ such that $B_\varepsilon(x) \subset V$. Since $x_n \to x$, there is a natural
number $N$ such that $x_n \in B_\varepsilon(x)$ for all $n > N$. Thus, $x_n \in V$ for all $n > N$.
Sufficiency. Let $\varepsilon > 0$. By the hypothesis, for the neighborhood $V = B_\varepsilon(x)$ of $x$,
there is a natural number $N$ with the property that $x_n \in V$ for all $n > N$, equivalently,
$d(x_n, x) < \varepsilon$ whenever $n > N$.


5.3.2 Corollary If a sequence $(x_n)_n$ is convergent to a point $x$ and all its terms
belong to a closed set $A$, then the limit also belongs to $A$.


Proof If $x \notin A$, then the complement of $A$ would be an open neighborhood of $x$.
According to Theorem 5.3.1, this forces $x_n \in \complement A$ for $n$ sufficiently large, which
contradicts the hypothesis.


Another easy consequence of Theorem 5.3.1 is the fact that every subsequence of
a convergent sequence is also convergent, to the same limit.

A sequence $(x_n)_n$ is called a Cauchy sequence if for every $\varepsilon > 0$, there is an
index $N$ such that for all $m, n > N$,

$$d(x_m, x_n) < \varepsilon. \tag{5.2}$$

Every convergent sequence is a Cauchy sequence and every Cauchy sequence
is bounded. Moreover, a Cauchy sequence containing a convergent subsequence is
necessarily convergent (adapt the argument of Lemma 2.4.4).

There are metric spaces containing Cauchy sequences without limit. See the case
of $M = (0, 1]$, endowed with the metric induced by $\mathbb{R}$, and of the Cauchy sequence
$(1/n)_n$.
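
The Cauchy property of the sequence $x_n = 1/n$ in $M = (0, 1]$ can be made quantitative: if $1/N \le \varepsilon$, then $|x_m - x_n| < 1/N \le \varepsilon$ for all $m, n > N$. The following sketch, with the hypothetical helper `cauchy_index`, checks this on a finite window of indices using exact rational arithmetic. It is only an illustration, since a finite check cannot replace the estimate.

```python
from fractions import Fraction


def x(n):
    """The n-th term of the sequence (1/n)_n, for n >= 1."""
    return Fraction(1, n)


def cauchy_index(eps):
    """Smallest N >= 1 with 1/N <= eps."""
    N = 1
    while Fraction(1, N) > eps:
        N += 1
    return N


eps = Fraction(1, 100)
N = cauchy_index(eps)
window = range(N + 1, N + 40)
is_cauchy_on_window = all(abs(x(m) - x(n)) < eps for m in window for n in window)

# Every term lies in (0, 1], but the limit in R, namely 0, does not.
terms_in_M = all(0 < x(n) <= 1 for n in range(1, 200))
```

The only candidate limit is $0$, which lies outside $M$, so the sequence has no limit in $M$.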

5.3.3 Definition A metric space M is said to be complete if every Cauchy sequence 
in M converges to a point in M. 


As was already noticed, the Euclidean space $\mathbb{R}^p$ is an example of complete metric
space. By Corollary 5.3.2, every closed subset of a complete metric space is a
complete metric space relative to the induced metric. Every metric space can be
embedded in a complete metric space. See Theorem 5.7.1.


5.3.4 Remark (Convergence in $\overline{\mathbb{R}}$) The notion of sequence with infinite limit (as
introduced in Sect. 2.5) can be subordinated to the general theory of convergence in
metric spaces by considering on the extended real line $\overline{\mathbb{R}} = [-\infty, \infty]$ the metric $\rho$
that makes the following bijective function an isometry:

$$f : \overline{\mathbb{R}} \to [-1, 1], \qquad f(x) = \begin{cases} -1 & \text{if } x = -\infty \\ \dfrac{x}{1 + |x|} & \text{if } x \in \mathbb{R} \\ 1 & \text{if } x = \infty. \end{cases}$$

Precisely, $\rho$ is defined by the formula

$$\rho(x, y) = |f(x) - f(y)|.$$

The fact that $\rho$ is indeed a metric is obvious. This metric induces on $\mathbb{R}$ the Euclidean
topology. See Remark 5.2.6. Denoting by $B_r^{(\rho)}(\infty)$ and $B_r^{(\rho)}(-\infty)$, respectively, the
balls around $\infty$ and $-\infty$ associated to $\rho$, it is easy to observe that when $r \in (0, 1)$,
then

$$B_r^{(\rho)}(\infty) = \{x \in \overline{\mathbb{R}} : \rho(x, \infty) < r\} = \Big\{x \in \mathbb{R} : \frac{1}{r} - 1 < x\Big\} \cup \{\infty\} = \Big(\frac{1}{r} - 1, \infty\Big]$$

and, similarly,

$$B_r^{(\rho)}(-\infty) = \Big[-\infty, 1 - \frac{1}{r}\Big).$$

Therefore, the topology associated to $\rho$ (called the natural topology of $\overline{\mathbb{R}}$) consists
of all sets of the form

$$D, \quad [-\infty, -s) \cup D, \quad D \cup (t, \infty], \quad [-\infty, -s) \cup D \cup (t, \infty],$$

where $D$ can be any open subset of $\mathbb{R}$ and $s$ and $t$ are any positive numbers.


Since every neighborhood of $\infty$ contains a ball $B_r^{(\rho)}(\infty) = (\frac{1}{r} - 1, \infty]$ for some
$r \in (0, 1)$ and every interval of the form $(\varepsilon, \infty]$ is a neighborhood of $\infty$, we infer
from Theorem 5.3.1 that a sequence $(a_n)_n$ of real numbers has the limit $\infty$ if and
only if for every $\varepsilon > 0$ there is an index $N$ such that for all $n > N$,

$$a_n > \varepsilon.$$

Similarly, a sequence $(a_n)_n$ of real numbers has the limit $-\infty$ if and only if for
every $\varepsilon > 0$ there is an index $N$ such that for all $n > N$,

$$a_n < -\varepsilon.$$

This provides a full motivation for our definition of sequences with infinite limits,
as given in Sect. 2.5.

From the point of view of $\overline{\mathbb{R}}$, the real sequences with infinite limits are convergent
(but they remain divergent in the context of $\mathbb{R}$, since the limits are outside $\mathbb{R}$).
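
The metric of this remark is easy to experiment with. The sketch below assumes the map $f(x) = x/(1+|x|)$ on the reals, extended by $f(\pm\infty) = \pm 1$, and represents $\pm\infty$ by the floating-point values `math.inf` and `-math.inf`; the function names are choices made for the illustration.

```python
import math


def f(x):
    """Bijection from the extended real line onto [-1, 1]."""
    if x == math.inf:
        return 1.0
    if x == -math.inf:
        return -1.0
    return x / (1 + abs(x))


def rho(x, y):
    """Metric on the extended real line that makes f an isometry."""
    return abs(f(x) - f(y))


def in_ball_at_infinity(x, r):
    """Membership in the rho-ball of radius r centered at +infinity."""
    return rho(x, math.inf) < r


r = 0.25  # the ball should be (1/r - 1, inf] = (3, inf]
for x in (-5.0, 0.0, 2.0, 3.0, 3.1, 10.0, math.inf):
    print(x, in_ball_at_infinity(x, r))

# The sequence a_n = n converges to +infinity for rho: rho(n, inf) = 1/(1 + n).
distances = [rho(float(n), math.inf) for n in (1, 10, 100, 1000)]
```

The printed memberships agree with the interval $(3, \infty]$, and the distances decrease toward $0$.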


Exercises 


1. Using the separation property of the topology of $\mathbb{R}$ and the topological characterization
of convergence, prove that $a_n \to a$ and $a_n \to b$ imply $a = b$.

2. Use the topological characterization of convergence to show that the limit of a
convergent sequence of nonnegative numbers is nonnegative too.

3. Suppose that $x_n \to x$ and $y_n \to y$ in the metric space $M$. Prove that
$d(x_n, y_n) \to d(x, y)$.

4. Let $M$ be a complete metric space. Prove that a dense subset $A$ of $M$ is a complete
metric space relative to the induced metric if and only if $A = M$. Infer that $\mathbb{Q}$ is
not complete relative to the natural metric.

5. According to Theorem 5.3.1, the convergent sequences relative to two equivalent 
metrics are the same. However, the Cauchy sequences may be different. Show 
this for the pair of metrics on R indicated in Remark 5.2.6. 

6. (Products of metric spaces). If $M_1 = (M_1, d_1)$ and $M_2 = (M_2, d_2)$ are two
metric spaces, then we can put on their Cartesian product $M_1 \times M_2$ a structure
of metric space by defining the distance between a pair of points $x = (x_1, x_2)$
and $y = (y_1, y_2)$ by the formula

$$d(x, y) = \big( d_1^2(x_1, y_1) + d_2^2(x_2, y_2) \big)^{1/2}.$$

Prove that:

(a) $d$ is indeed a metric on $M_1 \times M_2$.

(b) $\rho(x, y) = \max\{d_1(x_1, y_1), d_2(x_2, y_2)\}$ is a metric on $M_1 \times M_2$ equivalent
to $d$.

(c) The bounded sets in $M_1 \times M_2$ are precisely the sets whose projections on $M_1$
and $M_2$ are both bounded.

(d) The convergence of a sequence $(x_n)_n$ in $M_1 \times M_2$ is equivalent to the convergence
of its projections on $M_1$ and $M_2$ (the analogue of Proposition 3.2.3).

(e) $M_1 \times M_2$ is complete if and only if both spaces $M_1$ and $M_2$ are complete.

(f) The open sets of the metric topology on $M_1 \times M_2$ are unions (finite or infinite)
of products $U_1 \times U_2$, where $U_1$ is an open set of $M_1$ and $U_2$ is an open set of $M_2$.

(g) If $E_1 = (E_1, \|\cdot\|_1)$ and $E_2 = (E_2, \|\cdot\|_2)$ are both real normed linear spaces,
then so is $E_1 \times E_2$ with respect to the linear structure defined coordinatewise,

$$(x_1, x_2) + (y_1, y_2) = (x_1 + y_1, x_2 + y_2), \qquad \alpha(x_1, x_2) = (\alpha x_1, \alpha x_2),$$

and the norm defined by the formula

$$\|(x_1, x_2)\| = \big( \|x_1\|_1^2 + \|x_2\|_2^2 \big)^{1/2}.$$


5.4 Closed Sets and Points of Accumulation 


Our next goal is to investigate the fine connection between the topological nature of 
a set and the behavior of sequences of elements of that set. For this, we shall need a 
new topological notion. 

Let M be a metric space and let A be a subset of it. 


5.4.1 Definition We say that a point $a$ in $M$ is a point of accumulation (or a cluster
point) for $A$ if every neighborhood $V$ of $a$ contains points of $A$ distinct from $a$,
that is,

$$(V \setminus \{a\}) \cap A \ne \emptyset.$$

The set of points of accumulation of $A$ is also called the derived set of $A$ and
denoted $A'$.


The points of $A$ that are not points of accumulation are called isolated points.
These points admit neighborhoods containing no other points of $A$.

It is worth noticing that only the infinite sets may have points of accumulation. In
fact, if $A = \{a_1, \ldots, a_n\}$ is a finite subset of $M$ and $a \in A'$, then for every index $k$
such that $a \ne a_k$ there are pairs of disjoint neighborhoods $U_k \in \mathcal{V}_{a_k}$ and $V_k \in \mathcal{V}_a$.
Clearly, the intersection $V = \bigcap V_k$ (over these indices $k$) represents a neighborhood
of $a$ whose intersection with $A$ contains no point other than $a$. This contradicts the fact
that $a \in A'$. Therefore, a finite subset has no accumulation point.

In the case of R, all points of a nondegenerate interval are points of accumulation 
(for that interval). Some infinite and unbounded subsets of R (such as N and Z) admit 
no accumulation point and consist only of isolated points. 


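5.4.2 Bolzano-Weierstrass Theorem (for subsets of $\mathbb{R}^p$) Every infinite and
bounded subset $A$ of $\mathbb{R}^p$ admits points of accumulation.


Proof Since the set $A$ is infinite, it contains a sequence of distinct points. This
sequence is bounded as the set $A$ itself is bounded. By the Bolzano-Weierstrass
Theorem for sequences, it has a convergent subsequence. Since the terms are distinct,
every neighborhood of the limit contains points of $A$ other than the limit itself, so
the limit is a point of accumulation of $A$.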


The notion of point of accumulation leads to an important characterization of 
closed sets. 

For this, we start with the remark that all closure points which do not belong to
the set under consideration are necessarily points of accumulation, that is,

$$\overline{A} \setminus A \subset A'.$$

Since $A' \subset \overline{A}$, we conclude that

$$\overline{A} = A \cup A'.$$


5.4.3 Lemma If $a$ is a point of accumulation for $A$, then each neighborhood of it
contains infinitely many points of $A$.


Proof Suppose, by reductio ad absurdum, that there exists a neighborhood $V$ of
$a$ such that $A \cap V = \{a_1, \ldots, a_n\}$. According to the property of separation, for
every index $k$ with $a_k \ne a$, there exists a neighborhood $V_k \in \mathcal{V}_a$ that does not
contain $a_k$. In this way, $W = V \cap (\bigcap_k V_k)$ constitutes a neighborhood of $a$ for
which $A \cap (W \setminus \{a\}) = \emptyset$, in contradiction with the fact that $a$ is a point of
accumulation.


5.4.4 Proposition Suppose that $a$ is a point of accumulation of a set $A$. Then, there
exists a sequence of distinct points of $A$ converging to $a$.
Therefore, every closure point of $A$ is the limit of a sequence of points of $A$.


Proof Let $a$ be a point of accumulation of $A$. By Lemma 5.4.3, the ball $B_1(a)$
contains infinitely many points of $A$. According to the Axiom of Choice, we can choose a
point $a_0 \in B_1(a) \cap A$. A new application of Lemma 5.4.3 shows that the ball $B_{1/2}(a)$
also contains infinitely many points of $A$. Then, we can choose a point $a_1 \in B_{1/2}(a) \cap A$
with $a_1 \ne a_0$. Continuing in the same way, at step $n$ we choose $a_n \in B_{1/2^n}(a) \cap A$
such that $a_n \notin \{a_0, a_1, \ldots, a_{n-1}\}$. Clearly, $a_n \to a$.
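
For a concrete set the selection in this proof can be carried out by an explicit rule, so no appeal to the Axiom of Choice is needed. The sketch below does this for $A = \{1/k : k \ge 1\}$ and its accumulation point $a = 0$, always taking the first admissible index $k$; the function name `pick_sequence` and the selection rule are assumptions made for the illustration.

```python
from fractions import Fraction
from itertools import count


def pick_sequence(steps, a=Fraction(0)):
    """Choose a_n in B_{1/2^n}(a) ∩ A, distinct from a_0, ..., a_{n-1},
    where A = {1/k : k >= 1}, by scanning k = 1, 2, 3, ... at each step."""
    chosen = []
    for n in range(steps):
        radius = Fraction(1, 2 ** n)
        for k in count(1):
            candidate = Fraction(1, k)
            if abs(candidate - a) < radius and candidate not in chosen:
                chosen.append(candidate)
                break
    return chosen


seq = pick_sequence(5)
print(seq)
```

With this rule the chosen points are $1/2, 1/3, 1/5, 1/9, 1/17$, that is, $a_n = 1/(2^n + 1)$, which indeed converges to $0$.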


By combining the discussion above with Corollary 5.3.2 we obtain the following
characterization of closed sets in terms of convergent sequences.

5.4.5 Theorem Let $A$ be a subset of a metric space $M$. Then the following three
assertions are equivalent:


(a) $A$ is closed.
(b) $A$ contains all its points of accumulation.
(c) If $(a_n)_n$ is a sequence of points of $A$, convergent in $M$ to $a$, then $a \in A$.


An important technique when dealing with metric spaces is to approximate the
elements of a set by elements of a suitable smaller subset. The useful notion in this
respect is that of dense subset of a set. Precisely, a subset $B$ of $A$ is dense in $A$ if
$A \subset \overline{B}$. By Theorem 5.4.5, it follows that $B$ is dense in $A$ if and only if every point
of $A$ is the limit of a convergent sequence of points of $B$.

The sets that admit countable dense subsets are called separable. $\mathbb{R}^p$ and all its
subsets are separable. See Lemma 1.4.4 for $p = 1$ and Exercise 5, Sect. 3.2, for
$p > 1$. On the other hand, every uncountable set endowed with the discrete metric
is an example of nonseparable metric space.

In the context of infinite dimensional Banach spaces, the existence of an appro- 
priate sequence whose linear hull is a dense subspace, may lead to powerful conse- 
quences. See Theorem 9.8.1 for an illustration. 


Exercises 


1. Suppose that A is a nonempty subset of R consisting only of accumulation points. 
Can A be countable? 

2. Consider an uncountable subset $A$ of $\mathbb{R}$. Prove that $A$ contains points $x$ (called
condensation points) such that every neighborhood of $x$ contains uncountably
many points of $A$.

3. Let A be a nonempty subset of R. Prove that inf A and sup A are closure points 
of A. 

4. Prove that R is a separable space. 

5. Prove that every open subset of $\mathbb{R}^2$ (and thus of $\mathbb{C}$) is a countable union of open
discs (as well as of closed discs).

6. What is the closure of $\ell^1(\mathbb{N}, \mathbb{R})$ viewed as a subset of $c_0(\mathbb{N}, \mathbb{R})$?

7. Show that each of the spaces $c_0(\mathbb{N}, \mathbb{R})$, $c(\mathbb{N}, \mathbb{R})$, and $\ell^1(\mathbb{N}, \mathbb{R})$ contains bounded
sequences with no Cauchy subsequences.

8. Prove that the Banach spaces $c_0(\mathbb{N}, \mathbb{R})$ and $\ell^1(\mathbb{N}, \mathbb{R})$ are separable. One can prove
that $\ell^\infty(\mathbb{N}, \mathbb{R})$ is nonseparable. See [1], p. 8, for an easy argument.

[Hint: Notice that for any integer $n \ge 1$, the sequences having the first $n$ terms
rational numbers and all the others zero form a countable set.]


5.5 Compact Sets 


The bounded and closed intervals of $\mathbb{R}$ enjoy a remarkable selection property that
extends to all bounded and closed subsets of the Euclidean space $\mathbb{R}^p$:

5.5.1 Theorem (M. Fréchet) If $K$ is a subset of $\mathbb{R}^p$, then the following assertions
are equivalent:


(a) $K$ is a closed and bounded subset;
(b) Every sequence of points of $K$ contains a subsequence convergent to a point
also belonging to $K$.


Proof (a) implies (b) follows from Theorem 5.4.5 and the Bolzano-Weierstrass
Theorem (extended to $\mathbb{R}^p$ in Exercise 10, Sect. 3.4).

(b) implies (a). The fact that $K$ is closed also follows from Theorem 5.4.5. If $K$
were not bounded, then no finite union of balls of radius 1, centered at points of
$K$, could include $K$. In other words, for every finite subset $F$ of $K$,

$$K \setminus \Big( \bigcup_{x \in F} B_1(x) \Big) \ne \emptyset.$$

Fix arbitrarily $x_1 \in K$ and choose $x_2 \in K \setminus B_1(x_1)$, that is, $d(x_1, x_2) \ge 1$. Repeating
the argument above, we choose a point $x_3$ in $K \setminus (B_1(x_1) \cup B_1(x_2))$, which assures
$d(x_1, x_3) \ge 1$ and $d(x_2, x_3) \ge 1$. By mathematical induction, we conclude the
existence of a sequence $(x_n)_n$ of elements of $K$ such that

$$d(x_i, x_j) \ge 1 \quad \text{for all } i < j.$$

Such a sequence cannot contain any Cauchy subsequence, a fact that contradicts our
hypothesis. Therefore $K$ is a bounded set.
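
The inductive selection in the second half of this proof is a greedy procedure: keep a point whenever it lies outside all the unit balls centered at the points already kept. The sketch below runs it on the unbounded set $K = \{k/3 : k = 0, 1, 2, \ldots\} \subset \mathbb{R}$, scanned in increasing order; the helper name `separated_sequence`, the set $K$, and the scanning order are assumptions made for the illustration.

```python
from fractions import Fraction
from itertools import combinations, count


def separated_sequence(candidates, how_many):
    """Greedily keep points at distance >= 1 from all points kept so far,
    stopping once `how_many` points have been collected."""
    chosen = []
    for x in candidates:
        if all(abs(x - y) >= 1 for y in chosen):
            chosen.append(x)
            if len(chosen) == how_many:
                break
    return chosen


K = (Fraction(k, 3) for k in count())  # lazy enumeration of an unbounded set
xs = separated_sequence(K, 5)
pairwise_separated = all(abs(x - y) >= 1 for x, y in combinations(xs, 2))
```

Because $K$ is unbounded, the scan never runs out of admissible points. Any two terms of the resulting sequence are at distance at least $1$, so no subsequence can be Cauchy.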


Though boundedness is not a topological notion, the class of closed and bounded
subsets of $\mathbb{R}^p$ admits a topological characterization that is the subject of
Theorem 5.5.6.

Let $M$ be a metric space and $K$ a subset of it. An open cover of $K$ is every family
$\mathcal{U} = (U_i)_{i \in I}$ of open subsets of $M$ such that $K \subset \bigcup_{i \in I} U_i$. Choosing for each point
$x \in K$ an open neighborhood $V_x$ of it, the family $(V_x)_{x \in K}$ provides an open cover
of $K$.

We say that an open cover $(U_i)_{i \in I}$ of $K$ admits a finite subcover if there exists a
finite subset $J$ of $I$ such that the family $(U_i)_{i \in J}$ still constitutes an open cover of $K$.


5.5.2 Definition A subset K of M is compact if every open cover of it admits a 
finite subcover. 


5.5.3 Lemma Every compact subset K of a metric space is closed and bounded. 


Proof We show first that the set $K$ is closed, equivalently, $\complement K$ is open.

Let $x \in \complement K$. By the separation property of the metric topology, for every $k \in K$,
there exists a pair of open and disjoint neighborhoods, $U_k$ of $k$ and $V_k$ of $x$. Then,
the family $(U_k)_{k \in K}$ constitutes an open cover of $K$ and, according to our hypothesis,
one can extract a finite subcover $U_{k_1}, \ldots, U_{k_n}$. The set $V = V_{k_1} \cap \cdots \cap V_{k_n}$ is an
open neighborhood of $x$ and moreover, $V \cap (\bigcup_{j=1}^{n} U_{k_j}) = \emptyset$. Since $K \subset \bigcup_{j=1}^{n} U_{k_j}$,
it follows that $V \cap K = \emptyset$, that is, $V \subset \complement K$.

We shall show now that $K$ is also bounded. For this, choose an arbitrary point
$k \in K$ and consider the open cover of $K$ constituted by the family of balls
$B_1(k), B_2(k), B_3(k), \ldots$. Since this cover admits a finite subcover, there must exist
a natural number $N$ such that $K \subset \bigcup_{j=1}^{N} B_j(k) = B_N(k)$, and this implies that $K$
is a bounded set.


5.5.4 Lemma Every closed subset of a compact set is compact.


Proof Let $K$ be a compact set and let $S$ be a closed subset of it. Consider also an
arbitrary open cover $\mathcal{U} = (U_i)_{i \in I}$ of $S$. Since the complement of $S$ is open, we
can add to the family $\mathcal{U}$ the set $\complement S$ to get an open cover of $K$. Due to the compactness
of $K$, the family $\mathcal{U}$ must contain a finite subfamily $\mathcal{V} = (U_i)_{i \in J}$ such that $\mathcal{V} \cup \{\complement S\}$
is a finite cover of $K$. This implies that $(U_i)_{i \in J}$ is a finite subcover of $S$.


A natural extension of the notion of interval in the context of Euclidean spaces
$\mathbb{R}^p$ is that of p-dimensional interval. These intervals are Cartesian products

$$R = I_1 \times I_2 \times \cdots \times I_p$$

of p usual intervals in $\mathbb{R}$. When p = 2, these intervals are rectangles whose sides
are parallel to the coordinate axes. In order to avoid any confusion, we will refer to
the usual intervals in $\mathbb{R}$ as real intervals.

A p-dimensional interval $R = I_1 \times I_2 \times \cdots \times I_p$ (all of whose sides $I_k$
are nonempty sets) is bounded, closed, or open if and only if all real intervals
$I_1, I_2, \dots, I_p$ have the respective property. See Exercise 1.


5.5.5 Borel-Lebesgue Covering Lemma (also known as the Heine-Borel Lemma)
Every closed and bounded p-dimensional interval

$$R = [a_1, b_1] \times [a_2, b_2] \times \cdots \times [a_p, b_p]$$

of $\mathbb{R}^p$ is a compact set.


Proof Using bisection of each of the real intervals $[a_k, b_k]$, the p-dimensional interval
$R_0 = R$ can be decomposed into $2^p$ (p-dimensional) subintervals, each having
diameter equal to half the diameter of R. If $\mathcal{U}$ is an open cover of R that admits
no finite subcover, then at least one of these subintervals, say $R_1$, also admits no
finite subcover. By repeating the bisection procedure, we obtain a nested sequence
of p-dimensional intervals

$$R = R_0 \supset R_1 \supset R_2 \supset \cdots,$$

such that $\operatorname{diam} R_n = 2^{-n} \operatorname{diam} R_0$, and $R_n$ admits no finite
subcover with sets in $\mathcal{U}$. Assuming that

$$R_n = [a_1^{(n)}, b_1^{(n)}] \times [a_2^{(n)}, b_2^{(n)}] \times \cdots \times [a_p^{(n)}, b_p^{(n)}] \quad \text{for every } n \in \mathbb{N},$$

we infer from the Nested Intervals Lemma that each of the p intersections
$\bigcap_{n=0}^{\infty} [a_k^{(n)}, b_k^{(n)}]$ is nonempty, which implies that
$\bigcap_{n=0}^{\infty} R_n$ is nonempty too. Suppose z belongs to the intersection of all
p-dimensional intervals $R_n$. Since $z \in R_0$, it lies in an open set $U \in \mathcal{U}$.
Since $\operatorname{diam} R_n \to 0$, U must contain a p-dimensional interval $R_N$, so
$\{U\}$ provides a finite subcover of $R_N$, a contradiction.


5.5.6 Theorem (The structure of compact subsets of $\mathbb{R}^p$) A subset of $\mathbb{R}^p$ is
compact if and only if it is closed and bounded.


Proof The fact that every compact subset is closed and bounded was noticed in
Lemma 5.5.3. Conversely, if K is a closed and bounded subset of $\mathbb{R}^p$, then it is also
a closed subset of a suitable bounded and closed p-dimensional interval R, and the
latter is compact according to the Borel-Lebesgue Covering Lemma. The proof ends by
appealing to Lemma 5.5.4.


According to Theorem 5.5.6, all closed balls and all spheres of the Euclidean
space $\mathbb{R}^p$ are compact sets. Since $\mathbb{R}^2$ and $\mathbb{C}$ coincide as metric
spaces, from Theorem 5.5.6 we infer that a subset of $\mathbb{C}$ is compact if and only if it
is bounded and closed. In particular, the closed unit disc $\overline{D} = \overline{D}_1(0)$ and
the unit circle $S^1$ in $\mathbb{C}$ are compact sets.

None of the Euclidean spaces $\mathbb{R}^p$ is compact. However, these spaces are locally
compact in the sense that each point has a compact neighborhood.

A criterion of compactness in the framework of metric spaces is the subject
of Exercise 9.

When a subset S of a metric space is contained in a compact subset, S is called
a relatively compact subset. By Lemmas 5.5.3 and 5.5.4, a set S is relatively compact
if and only if its closure is compact. An inspection of the argument of Theorem 5.5.1
yields the following result:


5.5.7 Proposition A subset S of $\mathbb{R}^p$ is relatively compact if and only if it is bounded.


An interesting example of a compact set is Cantor's triadic set A, which we
already encountered in the proof of Corollary 1.7.2 (concerning the uncountability
of the unit interval [0, 1]). The construction of A starts by eliminating from the
compact interval $P_0 = [0, 1]$ the open middle third subinterval, $O_{11} = (\tfrac{1}{3}, \tfrac{2}{3})$.
The remainder set $P_1$ is the union of two compact intervals, $[0, \tfrac{1}{3}]$ and $[\tfrac{2}{3}, 1]$.
By applying to each of them the previous procedure of elimination of the open middle
thirds (that is, of $O_{21} = (\tfrac{1}{9}, \tfrac{2}{9})$ and $O_{22} = (\tfrac{7}{9}, \tfrac{8}{9})$, each of length
one third of the length of the interval from which it is eliminated), the remainder set
$P_2$ is the union of $2^2$ compact intervals, $[0, \tfrac{1}{9}]$, $[\tfrac{2}{9}, \tfrac{1}{3}]$,
$[\tfrac{2}{3}, \tfrac{7}{9}]$, $[\tfrac{8}{9}, 1]$, each of length $\tfrac{1}{3^2}$. At the next
step, we eliminate $2^2$ open middle third subintervals, $O_{31}, O_{32}, O_{33}, O_{34}$, leaving
a compact set $P_3$ which consists of $2^3$ compact subintervals, and so on. The set A is
the intersection of all these compact sets $P_n$. Clearly, the set A

is a compact set;

includes no nondegenerate subinterval;

consists only of points of accumulation (it contains all endpoints of the intervals
eliminated, and these points approximate all other points of A).




Surprisingly, the above three properties characterize A from a topological point of
view. See Theorem 6.7.3.
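
The construction of the sets $P_n$ can be sketched in a few lines of Python. This is only an illustration: the helper names `remove_middle_thirds` and `cantor_level` are assumptions of this sketch and do not come from the text. Exact rational arithmetic is used so that endpoints such as 1/3 and 2/9 are not rounded.

```python
from fractions import Fraction

def remove_middle_thirds(intervals):
    """Replace each closed interval [a, b] by [a, a + h] and [b - h, b], h = (b - a)/3."""
    result = []
    for a, b in intervals:
        h = (b - a) / 3
        result.append((a, a + h))
        result.append((b - h, b))
    return result

def cantor_level(n):
    """Return the list of closed intervals whose union is P_n (illustrative helper)."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        intervals = remove_middle_thirds(intervals)
    return intervals

level2 = cantor_level(2)                                  # the four intervals of P_2
total_length = sum(b - a for a, b in cantor_level(5))     # total length of P_5
```

The list `level2` reproduces the four intervals of $P_2$ named above, and `total_length` shows how quickly the total length of $P_n$ shrinks.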

Attached to the construction of Cantor’s triadic set is the Cantor-Lebesgue singular 
function described in Exercise 4, Sect. B2, Appendix B. 

The set A is uncountable and this fact is a consequence of the so-called triadic 
representation of its elements: 


5.5.8 Theorem Each element x of Cantor's triadic set admits a unique representation
of the form

$$x = \sum_{n=1}^{\infty} \frac{x_n}{3^n} \qquad (5.3)$$

with coefficients $x_n$ in {0, 2}.


Proof Notice first that every number x in [0, 1] admits a representation of the form
(5.3), with $x_n \in \{0, 1, 2\}$ for each index n; possibly, this representation is not unique.
See Sect. 1.5. The numbers in the first interval eliminated, (1/3, 2/3), are those
numbers x of [0, 1] that admit only representations (5.3) with $x_1 = 1$. The number
1/3 admits also the representation $\sum_{n=2}^{\infty} \frac{2}{3^n}$ and, in general, the numbers

$$x = \frac{x_1}{3} + \cdots + \frac{x_{m-1}}{3^{m-1}} + \frac{1}{3^m}$$

with $x_1, \dots, x_{m-1} \in \{0, 2\}$ admit also the representation

$$\frac{x_1}{3} + \cdots + \frac{x_{m-1}}{3^{m-1}} + \frac{0}{3^m} + \sum_{n=m+1}^{\infty} \frac{2}{3^n}.$$

The numbers eliminated at the second step of the construction of A, that is, those
belonging to $(\tfrac{1}{9}, \tfrac{2}{9}) \cup (\tfrac{7}{9}, \tfrac{8}{9})$, are the numbers left after the
first step that admit only triadic representations with $x_2 = 1$, and so on. Finally, let us
remark that the representation (5.3) is unique. In fact, if

$$\sum_{n=1}^{\infty} \frac{x_n}{3^n} = \sum_{n=1}^{\infty} \frac{y_n}{3^n}$$

with $x_n, y_n \in \{0, 2\}$ for every index n, then necessarily $x_n = y_n$ for every n.
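
The correspondence between the steps of the construction and the triadic digits can be followed numerically. The sketch below is an illustration only; the function name `cantor_digits` is an assumption of this example. It takes a rational number x in [0, 1] and tracks which third of the current interval contains it, returning digits in {0, 2} as long as x survives.

```python
from fractions import Fraction

def cantor_digits(x, depth):
    """Return the first `depth` digits (each 0 or 2) of a triadic expansion of x
    that avoids the digit 1, or None if x is eliminated before step `depth`.
    Assumes 0 <= x <= 1."""
    digits = []
    for _ in range(depth):
        if x <= Fraction(1, 3):
            digits.append(0)
            x = 3 * x          # rescale the left third to [0, 1]
        elif x >= Fraction(2, 3):
            digits.append(2)
            x = 3 * x - 2      # rescale the right third to [0, 1]
        else:
            return None        # x lies in an open middle third
    return digits
```

For instance, the endpoint 1/3 yields the digits 0, 2, 2, 2, ..., in agreement with the alternative representation of 1/3 mentioned in the proof, while 1/2 is rejected at the first step.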
5.5.9 Corollary The Cantor set A has cardinality $\mathfrak{c}$.


Proof One applies the Cantor-Schröder-Bernstein Theorem. In fact, the natural
inclusion $i : A \to [0, 1]$ provides an injective map from A to [0, 1]. The map

$$f : A \to [0, 1], \qquad f\Bigl(\sum_{n=1}^{\infty} \frac{x_n}{3^n}\Bigr) = \frac{1}{2} \sum_{n=1}^{\infty} \frac{x_n}{2^n},$$

induced by the triadic representation of Theorem 5.5.8, is surjective. According to
the Axiom of Choice, we can choose for each $x \in [0, 1]$ an element y in $f^{-1}(x)$, which
provides an injective map from [0, 1] to A.
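
The map f of the proof can be evaluated on finite digit strings. The following sketch is illustrative (the name `cantor_to_unit` is an assumption of this example); it applies the formula above to a truncated expansion with digits in {0, 2}.

```python
from fractions import Fraction

def cantor_to_unit(digits):
    """Apply f to the finite triadic expansion with the given digits in {0, 2}:
    sum x_n / 3^n  is sent to  (1/2) * sum x_n / 2^n."""
    return sum(Fraction(d, 2) / 2**n for n, d in enumerate(digits, start=1))
```

The point 2/3 (digits 2, 0, 0, ...) is sent to 1/2, and so is 1/3 (digits 0, 2, 2, 2, ...) in the limit, since its truncations give 1/4, 3/8, 7/16, and so on. This shows that f is surjective but not injective, which is why the proof selects one preimage for each point of [0, 1].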


As we will see in Sect. 6.7, Cantor’s triadic set can be used to encode every compact 
subset in a metric space. 


Exercises 


1. Consider p nonempty real intervals $I_1, I_2, \dots, I_p$. Prove that the p-dimensional
interval $R = I_1 \times I_2 \times \cdots \times I_p$ is bounded, closed, or open if and only if all
real intervals $I_1, I_2, \dots, I_p$ have the respective property.

2. Prove that the metric space $\overline{\mathbb{R}}$ is compact.

3. Let $(a_n)_n$ be a convergent sequence in a metric space and consider the set
constituted by the terms of the sequence and its limit. Prove that this set is compact.
Is the converse true?

4. (The one-point compactification of $\mathbb{N}$). This exercise is an application of the
preceding one. Consider the set $\overline{\mathbb{N}} = \mathbb{N} \cup \{\infty\}$ endowed with the
metric induced by $\overline{\mathbb{R}}$. Prove that:

(a) the induced topology on $\mathbb{N}$ is the discrete one;

(b) every open set containing $\infty$ includes the complement (in $\mathbb{N}$) of a finite subset
of natural numbers;

(c) $\overline{\mathbb{N}}$ is a compact space.

5. Prove that every compact metric space is separable.

[Hint: Consider the centers of finite coverings by open balls of radii $2^{-n}$, for
$n \in \mathbb{N}$.]

6. Let $\mathcal{U}$ be an open cover of the closed unit disc $\overline{D}_1(0)$ in $\mathbb{C}$. Prove that
$\mathcal{U}$ contains a finite subcover of $\overline{D}_1(0)$ which also covers an open disc $D_r(0)$
with r > 1.

7. (Bolzano-Weierstrass Theorem for Compact Metric Spaces). Suppose that M is
a compact metric space. Prove that every sequence $(x_n)_n$ of elements of M has
a convergent subsequence.
[Hint: The case where $(x_n)_n$ has only finitely many distinct terms is clear. Suppose
now that $(x_n)_n$ has a subsequence $(y_n)_n$ of distinct terms. According to
Proposition 5.4.4, it suffices to prove that $A = \{y_n : n \in \mathbb{N}\}$ has an accumulation
point. If no such point exists, then A is closed and for each n there exists
an open neighborhood $U_n$ of $y_n$ such that A and $U_n \setminus \{y_n\}$ do not intersect. This
gives rise to an open cover of M (consisting of $\complement A$ and all the sets $U_n$) that has
no finite subcover.]
8. Infer from the preceding exercise that every compact metric space is complete.
9. (The converse of the Bolzano-Weierstrass Theorem for Compact Metric Spaces).
Suppose that M is a metric space and every sequence of elements of M has a
convergent subsequence.
(a) Prove that for every open cover $\mathcal{U} = (U_i)_{i \in I}$ of M, there is a strictly positive
number r such that for every $x \in M$ the open ball $B_r(x)$ is included in one of
the sets $U_i$.
(b) Prove that M is complete and totally bounded (that is, for every $\varepsilon > 0$ there
are finitely many points $x_1, x_2, \dots, x_n$ in M such that $M = \bigcup_{k=1}^{n} B_\varepsilon(x_k)$).
(c) Use the preceding two results to conclude that M is compact.
[Hint: (a) Assuming that the contrary is true, choose a sequence $(x_n)_n$ such that
$B_{1/2^n}(x_n)$ is not included in any $U_i$. According to our hypothesis, a subsequence
$(x_{k(n)})_n$ is convergent, say to x. Then x belongs to a set $U_{i(x)}$, which should
include a ball $B_{1/2^p}(x_p)$. For (b), adapt the proof for the implication (b) $\Rightarrow$ (a)
in Theorem 5.5.1.]
Notice. Hausdorff's criterion of compactness asserts that a metric space is compact
if and only if it is complete and totally bounded. See [2], p. 251.

10. (Lebesgue's Number Lemma). Suppose that K is a compact set in a metric space
and $\mathcal{F}$ is an open cover of K. Prove that there exists a positive number $\rho$ such
that every open ball of radius $\rho$, centered at a point of K, is contained in some
single member of $\mathcal{F}$.


5.6 Baire Category 


In the topological framework, the sets can be compared by density. This led R. Baire 
to a classification of sets and also to a topological concept of negligibility. 

Let M be a metric space. A subset X of M is said to be nowhere dense if the 
interior of its closure is the empty set. It is said to be of first Baire category if X is 
a countable union of nowhere dense sets. All other subsets of M are said to be of 
second Baire category. 

Clearly, the countable subsets of $\mathbb{R}$ and the subsets of Cantor's triadic set are sets
of first Baire category. Their complements in $\mathbb{R}$ are of second Baire category. More
elaborate examples will come through the Baire Category Theorem. The proof of
that theorem depends on the following extension of the Nested Intervals Lemma:


5.6.1 Cantor's Lemma Suppose that M is a complete metric space. Then the
intersection of every nested sequence $(C_n)_n$ of nonempty closed subsets of M such
that $\lim_{n \to \infty} \operatorname{diam}(C_n) = 0$ is nonempty and consists of exactly one point.


Proof For each $n \in \mathbb{N}$, choose $x_n \in C_n$. Since

$$d(x_n, x_p) \le \operatorname{diam}(C_n)$$

for all $n \le p$ and $\operatorname{diam}(C_n) \to 0$ as $n \to \infty$, it follows that the sequence
$(x_n)_n$ is Cauchy. Let $x = \lim_{n \to \infty} x_n$. For each n, the set $C_n$ is closed and

$$x_p \in C_n \quad \text{whenever } p \ge n.$$

Hence $x \in C_n$ and so $x \in \bigcap_{n=0}^{\infty} C_n$. The intersection cannot contain any
other point y, since $0 \le d(x, y) \le \operatorname{diam}(C_n)$ for every n, which forces $d(x, y) = 0$.
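
On the real line, Cantor's Lemma is the principle behind the bisection method: the intervals produced are closed, nested, and their diameters tend to 0, so they shrink to a single point. The sketch below is an illustration only; the function name `bisect_nested` and the test equation $x^2 - 2 = 0$ are choices made for this example, not taken from the text.

```python
def bisect_nested(f, a, b, steps):
    """Return the nested closed intervals produced by bisection on [a, b],
    assuming f(a) and f(b) have opposite signs."""
    intervals = [(a, b)]
    for _ in range(steps):
        m = (a + b) / 2
        if f(a) * f(m) <= 0:
            b = m              # keep the left half [a, m]
        else:
            a = m              # keep the right half [m, b]
        intervals.append((a, b))
    return intervals

intervals = bisect_nested(lambda x: x * x - 2, 1.0, 2.0, 40)
```

After 40 steps the interval has length $2^{-40}$, and its endpoints approximate the unique common point of the whole nested sequence, here the square root of 2.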


5.6.2 Baire Category Theorem Suppose that M is a complete metric space. Then, 
the complementary set of each first Baire category subset X is dense. As a conse- 
quence, M is of second Baire category (into itself). 


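Proof Suppose that $X = \bigcup_{n=0}^{\infty} A_n$, where each $A_n$ is nowhere dense in M.
Without loss of generality, we may assume that each $A_n$ is closed.

Let V be an arbitrary nonempty open set. We will show that $V \cap \complement X \ne \emptyset$.

Let $U_0$ be any open ball of radius less than or equal to 1, included in V. Since $U_0$
is not a subset of $A_0$, it follows that $U_0 \cap \complement A_0$ is a nonempty open set. Choose
$U_1$ an open ball such that $\overline{U}_1 \subset U_0 \cap \complement A_0$ and $\operatorname{diam}(U_1) < 1/2$.
Suppose that the open balls $U_0, \dots, U_n$ have been chosen such that
$\overline{U}_{k+1} \subset U_k \cap \complement A_k$ and

$$\operatorname{diam}(U_{k+1}) < \frac{1}{k+1}$$

for $1 \le k \le n-1$. Then $U_n \cap \complement A_n$ is a nonempty open set, so there is an open ball
$U_{n+1}$ such that $\overline{U}_{n+1} \subset U_n \cap \complement A_n$ and $\operatorname{diam}(U_{n+1}) < 1/(n+1)$.
We thus obtain a nested sequence $(\overline{U}_n)_n$ of nonempty closed balls with
$\operatorname{diam}(\overline{U}_n) \to 0$. Since M is complete, there must exist $x \in M$ such that
$\bigcap_{n=0}^{\infty} \overline{U}_n = \{x\}$. Then

$$x \in \bigcap_{n=0}^{\infty} \overline{U}_n \subset U_0 \cap \bigcap_{n=0}^{\infty} \complement A_n \subset V \cap \complement\Bigl(\bigcup_{n=0}^{\infty} A_n\Bigr) = V \cap \complement X.$$

Since V was arbitrary, it follows that $\complement X$ is dense in M.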


A subset of a metric space is called a $G_\delta$ set if it is a countable intersection of open
sets and is called an $F_\sigma$ set if it is a countable union of closed sets. The complement
of a $G_\delta$ set is an $F_\sigma$ set and vice versa.

A special type of $G_\delta$ sets are the residual sets, which are the countable intersections
of dense open sets. By looking at the complementary set and taking into account the
formula

$$\complement \overline{A} = \operatorname{int}(\complement A)$$

(see Exercise 4, Sect. 5.2) we infer that every residual set in a complete metric space
is of second Baire category. Consequently:
5.6.3 Proposition Every residual subset of a complete metric space is dense.


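$\mathbb{Q}$ is an $F_\sigma$ set. In fact, $\mathbb{Q}$ is a countable union of single point sets (each
of them being closed).

$\mathbb{Q}$ is not a $G_\delta$ set (and a fortiori it is not a residual set). Indeed, if
$\mathbb{Q} = \bigcap_{n=0}^{\infty} D_n$, where each $D_n$ is an open set in $\mathbb{R}$, then each
$\complement D_n$ is a closed nowhere dense set, since $D_n \supset \mathbb{Q}$ is dense and therefore
$\operatorname{int} \complement D_n = \complement \overline{D_n} = \emptyset$. Thus

$$\mathbb{R} = \mathbb{Q} \cup \complement \mathbb{Q} = \mathbb{Q} \cup \Bigl(\bigcup_{n=0}^{\infty} \complement D_n\Bigr)$$

would be a countable union of nowhere dense sets, which contradicts the fact that
$\mathbb{R}$ is of second Baire category.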

In the topological framework, a property P relative to the elements of a set is
called generic if the subset of all elements having that property is a residual set. By
the discussion above, irrationality is generic among the real numbers.

A generic property concerns a large part of the entire set, suggesting that "most"
elements have the property P. However, this should be regarded with some caution,
since a generic set can be negligible from other points of view. For example,
the set $\mathcal{L}$ of Liouville numbers (introduced in Sect. 1.7) is a residual set in
$\mathbb{R}$, but for each $\varepsilon > 0$ there exists a sequence $(I_n)_n$ of intervals such that
$\mathcal{L} \subset \bigcup_{n} I_n$ and $\sum_{n} \ell(I_n) < \varepsilon$. See Exercise 2, Sect. 9.6.


Exercises 


1. Consider R endowed with the discrete metric. Prove that R is complete and 
describe all its subsets of first Baire category. 

2. Prove that every nondegenerate interval is a set of second Baire category. 

3. Prove that it is not possible to partition $\mathbb{R}$ into countably infinitely many
nonempty closed subsets.

[Hint: Assuming that $\mathbb{R} = \bigcup_{n} C(n)$ is such a partition, construct a sequence
of nested intervals $[a_n, b_n]$ such that $C(n) \cap [a_n, b_n] = \emptyset$ for all n.]

4. (Convergence is not generic). Let $\ell^\infty(\mathbb{N}, \mathbb{R})$ be the Banach space of all
bounded sequences of real numbers. Prove that $c(\mathbb{N}, \mathbb{R})$, the vector subspace of
$\ell^\infty(\mathbb{N}, \mathbb{R})$ consisting of convergent sequences, is a nowhere dense closed set.

5. Prove that: 

(a) c(N, R) is separable. 
(b) The closed unit ball of c(N, R) is not a compact set. 


5.7 Notes and Remarks 


The theory of metric spaces was initiated by Maurice René Fréchet in 1906, in 
an attempt to unify Cantor’s theory of point sets and the treatment of functions 
as points of a space. In a famous book published in 1914, Felix Hausdorff added 
many important results to this theory (and founded the general theory of topological
spaces). One such result is the completion of metric spaces: 


5.7.1 Theorem For each metric space M = (M, d), there exists a complete metric
space $\widehat{M} = (\widehat{M}, \widehat{d})$ and an isometry $i_M$ from M to $\widehat{M}$ such that
$i_M(M)$ is dense in $\widehat{M}$.

Moreover, the space $\widehat{M}$ is unique in the sense that if $X = (X, \rho)$ is a complete
metric space and f is an isometry from M to X such that f(M) is dense in X, then
there is an isometry g from $\widehat{M}$ onto X such that

$$f = g \circ i_M.$$

We call the pair $(\widehat{M}, i_M)$ the completion of M.


Proof (A sketch) The proof of Theorem 5.7.1 is similar to Cantor's construction of
$\mathbb{R}$ as the completion of $\mathbb{Q}$. It starts by considering the set $\mathcal{C}$ of all Cauchy
sequences in M, endowed with the following equivalence relation:

$$(x_n)_n \sim (y_n)_n \ \text{ if and only if } \ d(x_n, y_n) \to 0.$$

The set $\widehat{M} = \mathcal{C}/\!\sim$ of equivalence classes becomes a metric space relative to the
metric

$$\widehat{d}(\alpha, \beta) = \lim_{n \to \infty} d(x_n, y_n),$$

where $(x_n)_n \in \alpha$ and $(y_n)_n \in \beta$. The fact that this limit exists follows from the
quadrilateral inequality. See Exercise 2, Sect. 5.1. One can show easily that
$\widehat{d}(\alpha, \beta)$ is well-defined (that is, it does not depend on the representatives of
$\alpha$ and $\beta$) and that the map $\widehat{d}$ is indeed a metric.

The mapping $i_M : M \to \widehat{M}$ that associates to each $x \in M$ the equivalence class
$\widehat{(x)_n}$ (of the constant sequence $(x)_n$) provides an embedding of M into $\widehat{M}$.

Every element $\widehat{(x_n)_n}$ of $\widehat{M}$ can be approximated by elements of $i_M(M)$, more
precisely by the equivalence classes associated to the constant sequences $(x_p)_n$. Indeed,

$$\widehat{d}\bigl(\widehat{(x_n)_n}, \widehat{(x_p)_n}\bigr) = \lim_{n \to \infty} d(x_n, x_p),$$

and the right-hand side tends to 0 as $p \to \infty$ because $(x_n)_n$ is a Cauchy sequence.


If $(\alpha_k)_k$ is a Cauchy sequence in $\widehat{M}$ and $\alpha_k$ is the equivalence class of
$(x_n^{(k)})_n$, then $(x_n^{(n)})_n$ is a Cauchy sequence in M and

$$\alpha_k \to \alpha,$$

where $\alpha$ is the equivalence class of $(x_n^{(n)})_n$. This yields the completeness of $\widehat{M}$.


The mapping $g : \widehat{M} \to X$ which appears in the second part of Theorem 5.7.1 is
defined by the formula

$$g\bigl(\widehat{(x_n)_n}\bigr) = \lim_{n \to \infty} f(x_n).$$


The above argument also works in the context of normed linear spaces. 
5.7.2 Corollary For each normed linear space E, there exists a Banach space $\widehat{E}$
and a linear isometry $i_E$ from E to $\widehat{E}$ such that $i_E(E)$ is dense in $\widehat{E}$.

Moreover, the space $\widehat{E}$ is unique in the sense that if F is another Banach space
and f is a linear isometry from E to F such that f(E) is dense in F, then there is a
linear isometry g from $\widehat{E}$ onto F such that

$$f = g \circ i_E.$$


An interesting account of the history of compactness is due to Sundström [3].
The definition of a compact set as given in our text was provided by Pavel Sergeyevich
Alexandrov and Pavel Samuilovich Urysohn in 1923 and was based on the
Borel-Lebesgue Covering Lemma. Émile Borel proved in 1895 that any open cover of an
interval [a, b] admits a countable subcover. The existence of finite subcovers was
proved independently by Pierre Cousin (1895) and Henri Lebesgue (1898). It is worth
noticing that the proof of Lemma 5.5.5 follows closely the argument of Cousin.

The p-dimensional intervals $[a_1, b_1] \times [a_2, b_2] \times \cdots \times [a_p, b_p]$ of
$\mathbb{R}^p$ represent a clear generalization of compact real intervals to higher dimensions.
Is that the only way?

Since an interval [a, b] is a closed ball, the closed balls in $\mathbb{R}^p$ also provide a
good analogue of compact real intervals. In the case of infinite dimensional normed
linear spaces, the closed balls cannot be compact. This result, due to Frigyes Riesz, is
available in [1], p. 17.

An essential property of intervals is convexity. This notion can be extended to the
framework of normed linear spaces (and even beyond them) by saying that a subset
C of such a space is convex provided that

$$(1 - \alpha)x + \alpha y \in C \quad \text{for all } x, y \in C \text{ and } \alpha \in [0, 1].$$


The compact convex sets in normed linear spaces play an important role in the 
applications of mathematics to economics. 

Cantor mentioned the set bearing his name in a paper from 1883, as an example
of a perfect set which is nowhere dense. See Georg Cantor, Über unendliche, lineare
Punktmannigfaltigkeiten V, Math. Ann. 21 (1883), 545-591. However, before Cantor,
some other people considered basically the same set: Henry John Stephen Smith
(1874), Paul du Bois-Reymond (1880) and Vito Volterra (1881).

Cantor's triadic set A is an example of a set of Lebesgue measure zero. The precise
meaning of this term is given in Definition 9.6.1, but intuitively it is easy to explain why
A is "negligible". Indeed, A is obtained as the intersection of a sequence of compact
sets $P_n$, each being the union of $2^n$ intervals of length $3^{-n}$. Since the measure of $P_n$
is $(2/3)^n$, the measure of A should be less than or equal to
$\inf\{(2/3)^n : n \in \mathbb{N}\} = 0$.


Cantor’s triadic set has zero topological dimension, that is, every open cover of it 
has a refinement consisting of disjoint open sets so that any point in A is contained 
in exactly one open set of this refinement. Here by a refinement we mean a second 
open cover such that every set of the second open cover is a subset of some set in the 
first open cover. For the general concept of topological dimension, the reader may 
consult the excellent book of Edgar [4]. 
We pass now to the computation of the Hausdorff dimension of Cantor's triadic set.
Let $s \in [0, \infty)$. For each $\varepsilon > 0$ and each subset X of a metric space M, we set

$$H^s_\varepsilon(X) = \inf \sum_{k=0}^{\infty} (\operatorname{diam} U_k)^s,$$

where the infimum is taken over all sequences $(U_k)_k$ of open subsets of M such that
$X \subset \bigcup_{k=0}^{\infty} U_k$ and $\operatorname{diam} U_k < \varepsilon$ for all k. Since
$H^s_\varepsilon(X)$ is increasing as $\varepsilon$ decreases, we define the s-dimensional Hausdorff
measure of X by the formula

$$H^s(X) = \lim_{\varepsilon \to 0+} H^s_\varepsilon(X).$$

If $H^s(X) < \infty$ and $t > s$, then $H^t(X) = 0$. This allows us to define the Hausdorff
dimension of X as

$$\dim_H(X) = \inf\{s \in [0, \infty) : H^s(X) = 0\}.$$

5.7.3 Theorem The Hausdorff dimension of Cantor's triadic set A is

$$\dim_H(A) = \frac{\log 2}{\log 3}.$$


Proof Recall that A is the intersection of a decreasing sequence of closed subsets
$P_0 \supset P_1 \supset P_2 \supset \cdots$, obtained from $P_0 = [0, 1]$ by removing successively the
open middle thirds. Thus $P_n$ consists of $2^n$ closed intervals, each of length $3^{-n}$, and
this yields

$$H^s_\varepsilon(A) \le 2^n \Bigl(\frac{1}{3^n}\Bigr)^s = \Bigl(\frac{2}{3^s}\Bigr)^n$$

whenever $3^{-n} < \varepsilon$. Then $H^s(A) \le \lim_{n \to \infty} (2/3^s)^n$, and this limit is 0 for
$s > \frac{\log 2}{\log 3}$, whence $\dim_H(A) \le \frac{\log 2}{\log 3}$.
In order to prove the other inequality we need the following elementary remark:
If $a, a_1, \dots, a_n \ge 0$ and $0 < s \le 1$, then

$$a \le a_1 + \cdots + a_n \ \text{ implies } \ a^s \le a_1^s + \cdots + a_n^s.$$

Given an open cover $\mathcal{U}$ of A, it constitutes an open cover also for $P_n$ (starting
from some rank n on). According to the preceding remark we have

$$\sum_{U \in \mathcal{U}} (\operatorname{diam} U)^{\log 2/\log 3} \ge 2^n \Bigl(\frac{1}{3^n}\Bigr)^{\log 2/\log 3} = 1,$$

and thus $\dim_H(A) \ge \frac{\log 2}{\log 3}$. In conclusion, $\dim_H(A) = \frac{\log 2}{\log 3}$.
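
The role of the critical exponent log 2 / log 3 can be checked numerically on the natural covers of A by the $2^n$ intervals of $P_n$. The sketch below is only an illustration under that assumption; the function names `covering_exponent` and `covering_sum` are invented for this example.

```python
import math

def covering_exponent(n):
    """log N / log(1/eps) for the cover of P_n by N = 2**n intervals of length eps = 3**-n."""
    return math.log(2**n) / math.log(3**n)

def covering_sum(n, s):
    """Sum of (diameter)**s over the 2**n intervals of P_n, that is (2 / 3**s)**n."""
    return (2 / 3**s) ** n

critical = math.log(2) / math.log(3)   # approximately 0.6309
```

For s above `critical` the sums `covering_sum(n, s)` tend to 0 as n grows, for s below it they blow up, and at s equal to `critical` they stay equal to 1 (up to rounding), which matches the two inequalities of the proof.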
The sets whose Hausdorff dimension is different from their topological dimension
are usually called fractals. An example is provided by Cantor’s triadic set. 

Another remarkable property of A is that for any number in the interval [—1, 1] 
there exists a pair of numbers from the Cantor set whose difference is exactly that 
number: 

$$A - A = \{x - y : x, y \in A\} = [-1, 1].$$


For details, see the book of Gelbaum and Olmsted [5], pp. 86-87. 
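
The identity can be made plausible by a finite computation on the approximating sets $P_n$. The sketch below is an illustration, not the proof from the cited book; the helper names are assumptions of this example. It checks that the differences of the intervals of $P_n$ already fill the whole interval [-1, 1]; passing from $P_n$ to A still requires a compactness argument.

```python
from fractions import Fraction

def cantor_level(n):
    """Closed intervals whose union is P_n (illustrative helper)."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        intervals = [piece
                     for a, b in intervals
                     for piece in ((a, a + (b - a) / 3), (b - (b - a) / 3, b))]
    return intervals

def difference_set_covers(n):
    """Check that {x - y : x, y in P_n} is the whole interval [-1, 1]."""
    level = cantor_level(n)
    # For closed intervals, [a, b] - [c, d] = [a - d, b - c].
    diffs = sorted((a - d, b - c) for a, b in level for c, d in level)
    reach = Fraction(-1)
    for lo, hi in diffs:
        if lo > reach:
            return False       # a gap is left uncovered
        reach = max(reach, hi)
    return reach == 1
```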

Baire [6] was among the analysts active at the turn of the twentieth century who
used the ideas of set theory to solve problems in mathematical analysis. Later on,
Stefan Banach and his colleagues in Lvov outlined the prominent role of the Baire
Category Theorem in Functional Analysis. A thorough introduction to Functional
Analysis is offered by the book of Rajendra Bhatia [1].

Metric spaces represent a great source of important topological spaces, but some
interesting topological properties escape their framework. That is why we end this
section with a glimpse of topological spaces in full generality.


5.7.4 Definition A topological space is any set T endowed with a family $\mathcal{T}$ of
subsets (called the topology of T) that satisfies the following three properties:


(TOP1) $\emptyset$ and T belong to $\mathcal{T}$.
(TOP2) Any union of elements of $\mathcal{T}$ is an element of $\mathcal{T}$.
(TOP3) Any intersection of finitely many elements of $\mathcal{T}$ is an element of $\mathcal{T}$.
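
On a finite set the three axioms can be verified mechanically. The following sketch is an illustration (the function name `is_topology` is an assumption of this example); it relies on the fact that, for a finite family, closure under pairwise unions and intersections is enough.

```python
from itertools import combinations

def is_topology(points, family):
    """Check (TOP1)-(TOP3) for a family of subsets of a finite set `points`."""
    opens = {frozenset(s) for s in family}
    # (TOP1): the empty set and the whole set must be open.
    if frozenset() not in opens or frozenset(points) not in opens:
        return False
    # (TOP2) and (TOP3): for a finite family, pairwise closure suffices.
    return all(u | v in opens and u & v in opens
               for u, v in combinations(opens, 2))
```

For example, on {1, 2, 3} the family consisting of the empty set, {1}, {1, 2} and {1, 2, 3} passes the test, while replacing {1, 2} by {2} fails because the union {1, 2} is missing.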


The elements of a topological space T are called points and the sets belonging
to $\mathcal{T}$ are called open sets. Following the model of metric spaces, one can ascribe to
each topological space the concepts of closed set, neighborhood, interior, closure,
point of accumulation, open cover, compact set and so on. A topological space in
which every pair of distinct points have disjoint neighborhoods is called a Hausdorff
space (equivalently, we say that its topology is separated).

Every nonempty set T admits at least two topologies, the discrete topology, consisting
of all subsets of T, and the indiscrete topology, that consists only of $\emptyset$ and T.
The discrete topology coincides with the topology associated to the discrete metric.
However, when T contains at least two elements, the indiscrete topology does not
satisfy Hausdorff's separation property and thus is not associated to any metric.

A natural way to generate a topology on a set T, having a family $\mathcal{S}$ of subsets of
it, is to consider the intersection $\mathcal{T}(\mathcal{S})$ of all topologies $\mathcal{T}$ on T having the
property that $\mathcal{T} \supset \mathcal{S}$; one such topology $\mathcal{T}$ is $\mathcal{P}(T)$, the power set of T.
We call $\mathcal{T}(\mathcal{S})$ the topology generated by $\mathcal{S}$.
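
For a finite set T the generated topology can be computed directly by closing the given family under unions and intersections. The sketch below is an illustration for the finite case only; the function name `generated_topology` is an assumption of this example.

```python
from itertools import combinations

def generated_topology(points, family):
    """Smallest topology on the finite set `points` containing every set in `family`."""
    opens = {frozenset(), frozenset(points)} | {frozenset(s) for s in family}
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot, since new open sets are added along the way.
        for u, v in combinations(list(opens), 2):
            for w in (u | v, u & v):
                if w not in opens:
                    opens.add(w)
                    changed = True
    return opens

topology = generated_topology({1, 2, 3}, [{1, 2}, {2, 3}])
```

Here the sets {1, 2} and {2, 3} force the intersection {2} to be open, so `topology` consists of the empty set, {2}, {1, 2}, {2, 3} and {1, 2, 3}.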

Theorem 5.3.1 allows us to extend the notion of convergence. A sequence $(x_n)_n$
of points of a Hausdorff space T is said to be convergent to a point x (abbreviated,
$x_n \to x$) if for every neighborhood V of x, there is an index N such that for all
$n \ge N$,

$$x_n \in V.$$

Due to the separation property, the point x is unique. It represents the limit of the
given sequence.
Unfortunately, in the general context of Hausdorff spaces, the concept of convergent
sequence is not appropriate to characterize the closed sets and the points of
accumulation (as we did in Sect. 5.4). The solution is the theory of convergence of
nets (known as the Moore-Smith convergence), which is presented by Kelley [7].

An example of a separated topology on $\mathbb{R}$, different from the Euclidean topology,
is provided by the family $\mathcal{T}$ of all subsets of $\mathbb{R}$ which are arbitrary unions of sets of
the form

$$\{x\} \cup \bigl(\mathbb{Q} \cap (x - \varepsilon, x + \varepsilon)\bigr),$$

where $x \in \mathbb{R}$ and $\varepsilon > 0$. Clearly, $\mathcal{T}$ makes $\mathbb{R}$ a Hausdorff space and
$\mathbb{Q}$ is dense in $\mathbb{R}$ in this topology. Notice that $\mathbb{R} \setminus \mathbb{Q}$ is not separable
(its relative topology is discrete and the set is uncountable). Thus $(\mathbb{R}, \mathcal{T})$ is an
example of a separable Hausdorff space with a nonseparable subspace. More information on
topological spaces may be found in the books of John L. Kelley [7], Lynn A. Steen
and J. Arthur Seebach [8] and Stephen Willard [9].
Chapter 6
Continuous Functions 


The previous chapters outlined the important role played by convergent sequences
in their relationship with the topological theory of metric spaces. This makes equally
important the continuous functions acting on such spaces, that is, the functions
$f : M \to N$ with the following convergence preserving property: if $a_n \to a$ in M,
then $f(a_n) \to f(a)$ in N. As will be shown in this chapter, continuity represents a
powerful tool to answer a broad range of qualitative questions that appear in scientific
computation. Examples are the existence of solutions of a variety of equations such as


x +/x + 1000sin7x —e = 0, 


and the existence of extrema of oscillating functions like 


$$f(x) = \begin{cases} \dfrac{\sin x}{x} & \text{if } x \ne 0 \\ 1 & \text{if } x = 0. \end{cases}$$


6.1 The Notion of Continuity 


In what follows, we will consider functions $f : A \to B$, acting on nonempty subsets
of $\mathbb{R}$. Typically, A is a nondegenerate interval and $B = \mathbb{R}$. We would like to discuss
the local behavior of these functions around the different points in their domain of
definition. The basic notion here is that of continuity.


6.1.1 Definition We say that f is continuous at the point a (equivalently, a is a
point of continuity for f) if for every $\varepsilon > 0$, there is $\delta > 0$ such that for every $x \in A$
with $|x - a| < \delta$, we have $|f(x) - f(a)| < \varepsilon$.
The number $\delta$ depends in general on $\varepsilon$, the point a, and also on the function f.
Characteristic to the points of continuity is the fact that small perturbations of the
argument lead to small perturbations of the value of the function.
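
The dependence of $\delta$ on both $\varepsilon$ and a can be seen on a concrete function. The sketch below is an illustration chosen for this purpose (the function $f(x) = x^2$ and the name `delta_for_square` are assumptions of the example): if $|x - a| < \delta \le 1$, then $|x^2 - a^2| = |x - a|\,|x + a| < \delta\,(2|a| + 1)$, so the choice below works.

```python
def delta_for_square(a, eps):
    """Return a delta that works in Definition 6.1.1 for f(x) = x**2 at the point a."""
    return min(1.0, eps / (2 * abs(a) + 1))

def check(a, eps, samples=(-0.999, -0.5, 0.0, 0.5, 0.999)):
    """Test the epsilon-delta inequality at a few sample points x = a + t * delta."""
    delta = delta_for_square(a, eps)
    return all(abs((a + t * delta) ** 2 - a * a) < eps for t in samples)
```

Note how the returned value shrinks as |a| grows: for the same tolerance, a smaller perturbation of the argument is allowed far from the origin.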

A function $f : A \to B$ is said to be continuous if it is continuous at all points of
its domain.

All elementary functions are continuous. This is detailed in the next chapter, 
devoted to a systematic presentation of these functions. 

The property opposite to continuity is discontinuity. It is worth noticing that the 
problem of continuity/discontinuity makes sense only for points which belong to the 
domain of the function considered. 

A large class of continuous functions is that of Lipschitz functions. A function
$f : A \to \mathbb{R}$ is called Lipschitz (that is, it verifies the Lipschitz condition) if there is
a constant L > 0 such that

$$|f(x) - f(y)| \le L\,|x - y| \quad \text{for all } x, y \in A.$$

Two simple examples of such functions on $\mathbb{R}$ are the affine functions and the
absolute value function. The sine function and the cosine function are also Lipschitz
functions on $\mathbb{R}$. See Corollary 7.4.8. Other functions like the polynomials of degree
greater than 1, the exponential and the logarithm are only locally Lipschitz (that
is, Lipschitz on every compact interval included in the domain). This can be easily
proved using differential calculus. See Theorem 8.3.5.

The usual operations with functions (like addition, multiplication by real numbers
and composition) allow us to exhibit more examples of (locally) Lipschitz functions.

The square root function $f(x) = \sqrt{x}$ is continuous on $[0, \infty)$, but not Lipschitz
on any interval that contains the origin.
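
The contrast between the sine and the square root near the origin shows up in the difference quotients $|f(x) - f(y)| / |x - y|$, which a Lipschitz constant L must bound. The following sketch is a numerical illustration only (the helper name `difference_quotient` is an assumption of this example) and does not replace a proof.

```python
import math

def difference_quotient(f, x, y):
    """The ratio |f(x) - f(y)| / |x - y| that a Lipschitz constant must dominate."""
    return abs(f(x) - f(y)) / abs(x - y)

# Quotients between 0 and the points 10**-1, ..., 10**-6.
sqrt_quotients = [difference_quotient(math.sqrt, 10.0**-k, 0.0) for k in range(1, 7)]
sin_quotients = [difference_quotient(math.sin, 10.0**-k, 0.0) for k in range(1, 7)]
```

The quotients for the square root equal $1/\sqrt{x}$ and grow without bound as x approaches 0, whereas those for the sine never exceed 1.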

The characteristic function of $\mathbb{Q}$,

$$f(x) = \begin{cases} 1 & \text{if } x \in \mathbb{Q} \\ 0 & \text{if } x \in \mathbb{R} \setminus \mathbb{Q}, \end{cases}$$

known as Dirichlet's function, is discontinuous at every point $x \in \mathbb{R}$.

Notice that the problem of continuity of a function $f : A \to B$ is equivalent to
the problem of continuity of f viewed as a function with values in $\mathbb{R}$ (that is, of
$i_B \circ f$, where $i_B$ is the inclusion of B into $\mathbb{R}$). Thus, by restricting the codomain of f
to any set including f(A), the property of continuity/discontinuity of f at a certain
point in its domain does not change.

The definition of continuity at a point can be reformulated in topological terms: 


6.1.2 Proposition (The topological characterization of continuity) A function
$f : A \to \mathbb{R}$ is continuous at a point a if and only if for every neighborhood V of f(a),
there is a neighborhood U of a such that $f(U) \subset V$.


In other words, the function f is continuous at a if and only if the inverse image
of every neighborhood of f(a) is a neighborhood of a.
Proof The property of continuity means that for each $\varepsilon > 0$, there is $\delta > 0$ such that

$$x \in (a - \delta, a + \delta) \cap A \ \text{ implies } \ f(x) \in (f(a) - \varepsilon, f(a) + \varepsilon).$$

Thus, for $V = (f(a) - \varepsilon, f(a) + \varepsilon)$, we can find $U = (a - \delta, a + \delta) \cap A$, which
is a neighborhood of a in the relative topology of A. This suffices to complete
the proof because, in general, every neighborhood V of f(a) is a set containing an
$\varepsilon$-neighborhood of it.


A consequence is the following: 


6.1.3 Theorem Let $f : A \to \mathbb{R}$. The following assertions are equivalent:


(a) f is continuous.
(b) The inverse image of every open set is an open set.
(c) The inverse image of every closed set is a closed set.


Proof The fact that (a) and (b) are equivalent follows from Proposition 6.1.2 and the
remark that the open sets are precisely the sets that are neighborhoods for each of
their points.

(b) implies (c) is a set theory fact. If F is a closed set, then $F = \complement D$ for some
open set D. Thus,

$$f^{-1}(F) = f^{-1}(\complement D) = \complement f^{-1}(D)$$

is also a closed set, being complementary to an open set.
(c) implies (b) goes exactly in the same way.


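6.1.4 Corollary Let $f : A \to \mathbb{R}$ be a continuous function and let $a, b, c \in \mathbb{R}$. Then
the following sets

$$\{x \in A : f(x) < c\}, \quad \{x \in A : f(x) > c\}, \quad \{x \in A : a < f(x) < b\}$$

are open while

$$\{x \in A : f(x) \le c\}, \quad \{x \in A : f(x) \ge c\}, \quad \{x \in A : a \le f(x) \le b\}$$

and

$$\{x \in A : f(x) = c\}$$

are closed.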


The property of continuity can be formulated also in terms of convergent 
sequences: 
Fig. 6.1 Heaviside's function (or the unit step)


6.1.5 Theorem (Heine's characterization of continuity) A function $f$ from $A$ to $\mathbb{R}$
is continuous at a point $a$ if and only if it maps every sequence $(a_n)_n$ of elements of
$A$ that converges to $a$, to a sequence $(f(a_n))_n$ that converges to $f(a)$.


Proof Suppose that the function $f$ is continuous at $a$ and that $a_n \to a$ in $A$. We
will show that $f(a_n) \to f(a)$. Let $V$ be a neighborhood of $f(a)$. According to
Proposition 6.1.2, the set $U = f^{-1}(V)$ is a neighborhood of $a$ and since $a_n \to a$, it
follows that $a_n \in U$ starting with some index $N$. Thus, $f(a_n) \in V$ for all $n \ge N$,
whence $f(a_n) \to f(a)$.

Suppose now that $f$ maps every sequence converging to $a$ to a sequence converging
to $f(a)$, but $f$ is not continuous at $a$. Then, there is a neighborhood
$V$ of $f(a)$ such that $f(U)$ is not included in $V$, whenever $U$ is a neighborhood of $a$.
Thus, for each natural number $n \ge 1$, there is a point $a_n \in U_n = \{x \in A : |x - a| < 1/n\}$,
such that $f(a_n) \notin V$. Hence, $a_n \to a$, but $(f(a_n))_n$ does not converge to $f(a)$, which
is a contradiction. $\square$
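
The sequential criterion lends itself to a quick numerical illustration. The following minimal Python sketch (the helper names are illustrative and not part of the text) evaluates the unit step of Fig. 6.1, taken here as the characteristic function of $[0, \infty)$, along two sequences converging to 0. A finite computation proves nothing about limits, but it shows the mechanism by which Heine's criterion detects the discontinuity at 0.

```python
def heaviside(x: float) -> float:
    """Unit step, assumed here to be the characteristic function of [0, oo)."""
    return 1.0 if x >= 0 else 0.0


def image_of_sequence(f, seq):
    """Apply f termwise to a finite piece of a sequence."""
    return [f(x) for x in seq]


# Two sequences converging to 0: one from the left, one from the right.
left = [-1.0 / n for n in range(1, 8)]
right = [1.0 / n for n in range(1, 8)]

left_values = image_of_sequence(heaviside, left)    # constantly 0
right_values = image_of_sequence(heaviside, right)  # constantly 1
value_at_zero = heaviside(0.0)                      # equals 1
```

Along the left sequence the images stay at 0 while the value at the origin is 1, so the images cannot converge to the value of the function at 0.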


Next, we will discuss operations with continuous functions. First, we will consider 
the composition of functions: 


6.1.6 Lemma Let $f : A \to \mathbb{R}$ and $g : B \to \mathbb{R}$ be two functions such that $f(A) \subset B$,
$f$ is continuous at $a$ and $g$ is continuous at $f(a)$. Then, $g \circ f$ is continuous at $a$.


Proof In fact, according to Heine's characterization of continuity, $a_n \to a$ in $A$
implies $f(a_n) \to f(a)$ in $B$, which in turn yields $g(f(a_n)) \to g(f(a))$ in $\mathbb{R}$.
Hence, $g \circ f$ is continuous at $a$.


Lemma 6.1.6 implies that the absolute value of every continuous function is a
continuous function.

It also shows that the restriction of a continuous function is continuous. Simple
examples (such as the characteristic function of the interval $[0, \infty)$), show that from
the continuity of the restriction $f|_A$ (of a function $f$ to a set $A$ that contains $a$), we
cannot infer the continuity of the function $f$ at $a$. See Fig. 6.1.

A case when this possibility occurs is presented in Exercise 3.

We consider now the algebraic operations with continuous functions. They are all
immediate consequences of Heine's characterization of continuity:

6.1.7 Lemma Let $f, g : A \to \mathbb{R}$ be two functions continuous at a point $a$ and let $\alpha$
and $\beta$ be two real numbers. Then, the functions $\alpha f + \beta g$ and $fg$ are continuous at $a$.


6.1.8 Remark Denote by $C_b(A, \mathbb{R})$ the set of all bounded continuous functions $f$ from $A$ to $\mathbb{R}$.
According to Remark 1.2.9 and Lemma 6.1.7, this set is a commutative algebra with
unit (the constant function identically 1) when endowed with the pointwise
operations. As already noticed,
\[ f \in C_b(A, \mathbb{R}) \ \text{ implies } \ |f| \in C_b(A, \mathbb{R}), \]
so that $C_b(A, \mathbb{R})$ is a linear lattice of functions. In particular, if the function $f$ is
continuous, then so are $f^+$ and $f^-$. The natural norm on $C_b(A, \mathbb{R})$ is the supremum
norm,
\[ \|f\|_\infty = \sup_{x \in A} |f(x)|, \]
that makes $C_b(A, \mathbb{R})$ a Banach space. We will come back with details in Sect. 6.9.
The continuity of quotients of continuous functions follows from the following 
lemma: 


6.1.9 Lemma If the function $f : A \to \mathbb{R}$ is continuous at the point $a$ and $f(a) \ne 0$,
then $f$ differs from 0 on a neighborhood $U$ of $a$ and the function $1/f$ (that is defined
at least on $U$) is continuous at $a$.


Proof We will prove only the first part of Lemma 6.1.9, noticing that the second part
follows easily from Heine's characterization of continuity and Theorem 2.1.10.
Assuming (by reductio ad absurdum) that in every neighborhood of $a$, there exist
points at which $f$ vanishes, we infer the existence of a sequence of points $a_n \in A$
such that $|a_n - a| < 1/n$ and $f(a_n) = 0$ for all $n \ge 1$. Then $a_n \to a$, so by Heine's
characterization of continuity, we get $f(a) = \lim_{n \to \infty} f(a_n) = 0$, which contradicts
the hypothesis.


6.1.10 Lemma If the function $f : A \to \mathbb{R}$ is continuous at a point $a$ and verifies
$f(a) \ge C > 0$, then $f > C/2$ on a suitable neighborhood of $a$.


Proof Since $(C/2, \infty)$ is a neighborhood of $f(a)$, Proposition 6.1.2 shows that the set
$U = f^{-1}((C/2, \infty))$ is a neighborhood of $a$, and it is clear that $f(x) > C/2$ for every $x \in U$.


The operations with continuous functions prove the continuity of most of the 
usual functions. For example, all polynomial functions with real coefficients are 
continuous functions on R; also, the rational functions (that is, the quotients of 
polynomial functions) are continuous on their domains of definition (which is R 
without the roots of the denominator). 


6.1.11 Definition A homeomorphism is a bijective function $f : A \to B$ such that
both $f$ and its inverse $f^{-1}$ are continuous. Two subsets $A$ and $B$ of $\mathbb{R}$ are called
homeomorphic if there is a homeomorphism $f : A \to B$.

For example, if $a < b$, then every interval $[a, b]$ is homeomorphic with the interval
$[0, 1]$. A homeomorphism is provided by the function
\[ f : [a, b] \to [0, 1], \quad f(x) = \frac{x - a}{b - a}. \]


In the same way, all open intervals $(a, b)$ are homeomorphic with the interval
$(0, 1)$. It is worth mentioning that actually all (nonempty) open intervals are
homeomorphic to each other. For example, a homeomorphism between $\mathbb{R}$ and $(-1, 1)$
(respectively, between $(0, \infty)$ and $(0, 1)$), is provided by the function
\[ f(x) = \frac{x}{1 + |x|}. \]

The intervals of a different topological nature (such as (0, 1] and [0, 1]) cannot 
be homeomorphic. See Exercise 5 at the end of Sect. 6.4. 

The composition of two homeomorphisms is a homeomorphism and so is the 
inverse of a homeomorphism. The homeomorphisms map (and take back) open sets 
into open sets, closed sets into closed sets and convergent sequences into convergent 
sequences. They allow us to identify (from the topological point of view) the spaces 
on which they are acting. 
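
As a small illustration of a homeomorphism together with its inverse, the sketch below uses the standard map $x \mapsto x/(1 + |x|)$ from $\mathbb{R}$ onto $(-1, 1)$ and its inverse $y \mapsto y/(1 - |y|)$. The function names and the sample points are assumptions made for this example only; a floating-point round trip is of course just a sanity check, not a proof of bijectivity or continuity.

```python
def to_open_interval(x: float) -> float:
    """Map the real line into (-1, 1) via x / (1 + |x|)."""
    return x / (1.0 + abs(x))


def from_open_interval(y: float) -> float:
    """Inverse map from (-1, 1) back to the real line via y / (1 - |y|)."""
    if not -1.0 < y < 1.0:
        raise ValueError("y must lie in the open interval (-1, 1)")
    return y / (1.0 - abs(y))


# Illustrative sample points, listed in increasing order.
samples = [-1000.0, -2.5, 0.0, 0.3, 7.0]
images = [to_open_interval(x) for x in samples]
recovered = [from_open_interval(y) for y in images]
```

The images keep the order of the samples (the map is strictly increasing) and the round trip recovers the samples up to rounding error.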


Exercises 


1. Prove Heine's characterization of continuity by using directly the definition with
$\varepsilon$ and $\delta$ of the continuity at a point.

2. Let $\alpha \in (0, 1]$. We say that a function $f : A \to \mathbb{R}$ is a Lipschitz function of order
$\alpha$, if there is a constant $C > 0$ such that
\[ |f(x) - f(y)| \le C|x - y|^\alpha \quad \text{for all } x, y \in A. \]

(a) Prove that $|x|^\alpha$ is such a function on $\mathbb{R}$ and that it is not Lipschitz except when
$\alpha = 1$.

(b) Prove that the set $\mathrm{Lip}^\alpha(A, \mathbb{R})$, of all Lipschitz functions of order $\alpha$, is a
linear space.

3. Let $A$ and $B$ be two closed subsets of $\mathbb{R}$ and let $f : A \cup B \to \mathbb{R}$ be a function
whose restrictions to $A$ and $B$ are both continuous. Prove that the function $f$ is
continuous.

4. Let $f : A \to \mathbb{R}$ be a function defined on a subset of $\mathbb{R}$. Prove that $f$ is continuous
at every isolated point of $A$.

5. (No analogue of the Cantor-Schröder-Bernstein Theorem in the topological
framework). Let $X = \{x : x \text{ is rational and } x < 0\} \cup \{2^n : n \in \mathbb{N}\}$ and $Y = X \cup \{0\}$.
Prove that there exist bijective continuous maps from $X$ to $Y$ and from $Y$ to $X$,
though $X$ and $Y$ are not homeomorphic.

[Hint: A bijective continuous map $f$ from $X$ to $Y$ is $f(x) = x$ if $x < 1$, $f(1) = 0$,
and $f(x) = x/2$ if $x > 1$. A bijective continuous map $g$ from $Y$ to $X$ can be
defined starting with $g(y) = y - 1$ if $y < 1$. The unmatched points of $Y$ are all
isolated and we can extend $g$ by using any bijective map of the unmatched points
of $Y$ onto the unmatched points of $X$.]

6. (A generalization of the functions $|x|$, $x^-$ and $x^+$). Let $C$ be a nonempty closed
subset of $\mathbb{R}$. The distance of a point $x \in \mathbb{R}$ to the set $C$ is defined by the formula
\[ d(x, C) = \inf\{|x - y| : y \in C\}. \]

Prove that:
(a) $d(x, C) = 0$ if and only if $x \in C$;
(b) $|d(x, C) - d(y, C)| \le |x - y|$ for all $x, y \in \mathbb{R}$.

7. (Urysohn's Lemma). Let $K_1$ and $K_2$ be two nonempty, disjoint and compact
subsets of $\mathbb{R}$. Prove that there is a continuous function $f : \mathbb{R} \to [0, 1]$ such that
\[ f|_{K_1} = 0 \quad \text{and} \quad f|_{K_2} = 1. \]

[Hint: Take $f(x) = d(x, K_1)/(d(x, K_1) + d(x, K_2))$.]

8. Consider the tent map $T(x) = 1 - |1 - 2x|$, for $x \in [0, 1]$. Prove that for every
sequence $a = (a_n)_{n \in \mathbb{Z}}$ of real numbers, the function $R_a(x) = a_{\lfloor x \rfloor} T(\{x\})$ is
continuous on $\mathbb{R}$. When is this function Lipschitz?


6.2 Limits of Functions 


The behavior of a function around a point of accumulation of its domain (but not 
necessarily in the domain) can be described by using the concept of limit. 

In the sequel we consider functions $f : A \to \mathbb{R}$ (defined on subsets $A$ of $\mathbb{R}$) and
points $a \in \overline{\mathbb{R}}$, which are points of accumulation for $A$.


6.2.1 Definition We say that $\ell \in \overline{\mathbb{R}}$ is the limit of the function $f$ at the point $a$
(abbreviated, $\lim_{x \to a} f(x) = \ell$), if for every neighborhood $V$ of $\ell$, there is a
neighborhood $U$ of $a$ such that
\[ x \in U,\ x \ne a \ \text{ implies } \ f(x) \in V. \]
If the limit exists, then it is unique. This is an easy consequence of the property
of separation of the topology of $\overline{\mathbb{R}}$.


6.2.2 Remark It is important to also have at hand the characterization of limits with
$\varepsilon$ and $\delta$. The formulation depends on the nature of $a$ and $\ell$. For example:

(a) If $a$ and $\ell$ both are real numbers, then $\lim_{x \to a} f(x) = \ell$ means that given $\varepsilon > 0$,
there is $\delta > 0$ such that
\[ x \in A,\ 0 < |x - a| < \delta \ \text{ implies } \ |f(x) - \ell| < \varepsilon. \]

(b) If $a$ is a real number, then $\lim_{x \to a} f(x) = \infty$ means that given $\varepsilon > 0$, there is
$\delta > 0$ such that
\[ x \in A,\ 0 < |x - a| < \delta \ \text{ implies } \ f(x) > \varepsilon. \]

(c) $\lim_{x \to \infty} f(x) = -\infty$ means that for every $\varepsilon > 0$, there is $\delta > 0$ such that
\[ x \in A,\ x > \delta \ \text{ implies } \ f(x) < -\varepsilon. \]


The results concerning the algebraic operations with limits of functions are similar
to those on limits of sequences and thus, the details are left to the reader. See
Exercise 2.

Some simple examples of limits are as follows:

(a) If $\alpha > 0$, then $\lim_{x \to \infty} x^\alpha = \infty$ and $\lim_{x \to \infty} x^{-\alpha} = 0$.
(b) $\lim_{x \to \infty} e^x = \infty$ and $\lim_{x \to -\infty} e^x = 0$.
(c) $\lim_{x \to 0} \log x = -\infty$ and $\lim_{x \to \infty} \log x = \infty$.


6.2.3 Theorem (Heine's characterization of limits) Let $f : A \to \mathbb{R}$ be a function
and let $a \in \overline{\mathbb{R}}$ be a point of accumulation for $A$. Then, $\lim_{x \to a} f(x) = \ell$ if and only if
$f(a_n) \to \ell$ for every sequence $(a_n)_n$ of points of $A \setminus \{a\}$ with $a_n \to a$.

The connection between the continuity and the existence of limits is now clear:


6.2.4 Theorem Let $f : A \to \mathbb{R}$ be a function and let $a \in A$ be a point of
accumulation for $A$. Then, $f$ is continuous at $a$ if and only if the limit of $f$ at $a$ exists and
equals the value of $f$ at that point.


As a theoretical application of Theorem 6.2.4, we mention here the extension
by continuity. Briefly, it can be described as follows. Suppose that $f : A \to \mathbb{R}$
is a function and $a \in \mathbb{R} \setminus A$ is a point of accumulation of $A$ such that the limit
$\lim_{x \to a} f(x) = \ell \in \mathbb{R}$ exists. Then, the function
\[ \tilde{f} : A \cup \{a\} \to \mathbb{R}, \quad \tilde{f}(x) = \begin{cases} f(x) & \text{if } x \in A \\ \ell & \text{if } x = a, \end{cases} \]
represents an extension of $f$ to $A \cup \{a\}$ and this is the only one which is continuous
at the point $a$.
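
The construction can be sketched in a few lines of Python. The helper `extend_by_continuity` is a hypothetical name introduced for this illustration, and the example relies on the classical fact that $(\sin x)/x$ has limit 1 at 0; the code merely packages a known limit, it does not compute it.

```python
import math


def extend_by_continuity(f, a, limit):
    """Return the extension of f to the extra point a, with value `limit` there.

    The caller is responsible for supplying the correct limit of f at a.
    """
    def extended(x):
        return limit if x == a else f(x)
    return extended


def sinc_raw(x: float) -> float:
    """sin(x)/x, defined only for x != 0."""
    return math.sin(x) / x


# Extension by continuity at 0, using the classical limit sin(x)/x -> 1.
sinc = extend_by_continuity(sinc_raw, 0.0, 1.0)
value_at_zero = sinc(0.0)
value_nearby = sinc(1e-4)
```

Values at points close to 0 are close to the prescribed value at 0, as continuity at that point requires.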

The extension by continuity can be derived via the following result: 


6.2.5 Theorem (Cauchy's Criterion for limits) Let $f : A \to \mathbb{R}$ be a function and let
$a \in \overline{\mathbb{R}}$ be a point of accumulation for $A$. Then, $f$ admits a finite limit at $a$ if and only
if for every $\varepsilon > 0$, there exists $\delta > 0$ such that
\[ x, y \in (A \setminus \{a\}) \cap I_\delta(a) \ \text{ implies } \ |f(x) - f(y)| < \varepsilon. \]

Recall that $I_\delta(a) = (a - \delta, a + \delta)$ if $a \in \mathbb{R}$, $I_\delta(a) = (-\infty, -\delta)$ if $a = -\infty$ and
$I_\delta(a) = (\delta, \infty)$ if $a = \infty$.

A refinement of the concept of limit at a point is provided by one-sided (or lateral) 
limits: 


6.2.6 Definition Let $f$ be a function defined on a set $A$ and let $a$ be a point of
accumulation of $A \cap (-\infty, a)$. We say that $\ell \in \overline{\mathbb{R}}$ is the left-handed limit of $f$ at $a$
(that is, the limit of $f(x)$ as $x$ approaches $a$ from the left), if for every neighborhood
$V$ of $\ell$, there exists a neighborhood $U$ of $a$ such that for every $x \in U$ with $x < a$,
we have $f(x) \in V$.


The situation in Definition 6.2.6 is denoted by
\[ \ell = \lim_{x \to a-} f(x), \quad \text{or } \ell = f(a-), \quad \text{or } \ell = \lim_{x \to a,\, x < a} f(x). \]
In a similar way, one defines the notion of right-handed limit of $f$ at $a$ (denoted
$\ell = \lim_{x \to a+} f(x)$, or $\ell = f(a+)$, or $\ell = \lim_{x \to a,\, x > a} f(x)$); of course, this supposes that
$a$ is a point of accumulation for $A \cap (a, \infty)$.
Clearly, the following test for the existence of limits holds true:


6.2.7 Proposition $\ell \in \overline{\mathbb{R}}$ is the limit of the function $f$ at a point $a$ if and only if each
of the one-sided limits which makes sense exists and equals $\ell$.


Exercises 


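1. Consider the polynomial functions $P(x) = a_0 x^m + a_1 x^{m-1} + \cdots + a_m$, and
$Q(x) = b_0 x^n + b_1 x^{n-1} + \cdots + b_n$, where $a_0$ and $b_0$ are different from zero.
Prove that $\lim_{x \to \infty} P(x) = (\operatorname{sgn} a_0) \cdot \infty$ and
\[ \lim_{x \to \infty} \frac{P(x)}{Q(x)} = \begin{cases} 0 & \text{if } m < n \\ a_0/b_0 & \text{if } m = n \\ \operatorname{sgn}(a_0/b_0) \cdot \infty & \text{if } m > n. \end{cases} \]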


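2. (The algebraic operations with limits of functions). Suppose that
\[ \lim_{x \to \infty} f(x) = \ell \quad \text{and} \quad \lim_{x \to \infty} g(x) = \ell', \]
where $\ell$ and $\ell'$ belong to $\overline{\mathbb{R}}$. Prove that $\lim_{x \to \infty} (f(x) + g(x)) = \ell + \ell'$ and
$\lim_{x \to \infty} (f(x) g(x)) = \ell \ell'$ as long as the operations with $\ell$ and $\ell'$ make sense.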


3. (Taking the limit of inequalities). Most of the results concerning the limits of
sequences can be extended to limits of functions.
(a) Let $A$ be a subset of $\mathbb{R}$ and let $a$ be a point of accumulation of $A$. Prove that
for any functions $f, g : A \to \mathbb{R}$ having limit at $a$ and $f \le g$ on $A \setminus \{a\}$,
\[ \lim_{x \to a} f(x) \le \lim_{x \to a} g(x). \]
(b) Formulate and prove the analogue of the Squeeze Theorem.
(c) Prove that $\lim_{x \to 0} x \left[ \frac{1}{x} \right] = 1$.

(d) Infer from Corollary 4.3.4 that $\lim_{x \to 0} \frac{\sin x}{x} = 1$.


4. Let $A$ be a subset of $\mathbb{R}$ and let $a$ be a point of accumulation of $A$. If $f, g : A \to \mathbb{R}$
are two functions such that $f$ is bounded and $\lim_{x \to a} g(x) = 0$, prove that the limit
of $fg$ at $a$ exists and equals 0. Infer that


5. Determine the set of continuity of the function


1 AX 
PR FOS te 


noo | +e" : 


6. Prove Heine's characterization of limits (Theorem 6.2.3 above).
7. Prove Cauchy's Criterion for the existence of finite limits (Theorem 6.2.5 above).
8. (A continuous analogue of the Stolz-Cesàro Theorem). Suppose that $f$ is a
real-valued function bounded on every compact subinterval of $[0, \infty)$ and
$\lim_{x \to \infty} (f(x + 1) - f(x)) = a$. Prove that $\lim_{x \to \infty} \frac{f(x)}{x} = a$.


9. (Asymptotic Formulae). It is often useful to compare the behavior of a function
with that of a usual function, such as a polynomial function, the exponential, and
the logarithm. An instance of this idea is the slant asymptote at
$\infty$ of a function $f$. It consists in finding two real constants $m$ and $n$ such that
\[ \lim_{x \to \infty} (f(x) - mx - n) = 0. \]
These constants are provided by the formulas
\[ m = \lim_{x \to \infty} \frac{f(x)}{x} \quad \text{and} \quad n = \lim_{x \to \infty} (f(x) - mx). \]
In a similar way, we may introduce the slant asymptotes at $-\infty$. When $m = 0$,
we speak of horizontal asymptotes. A vertical line $x = a$ is a vertical asymptote
to the graph of a function $y = f(x)$, $x \in I$, if $a$ is a finite point of accumulation
of $I$ and at least one of the limits $\lim_{x \to a-} f(x)$ and $\lim_{x \to a+} f(x)$ is infinite.

Find the asymptotes of the functions


(a) f(x) =vx-+1—x, x ER; (b) 900) = x ER\ {Il}. 


10. (Asymptotic notations). Given two functions $f, g : A \to \mathbb{R}$ and a point of
accumulation $a$ of $A$, we denote
\[ f = O(g) \]
or, $f = O(g)$ for $x \to a$ (to be more accurate), if there exists a neighborhood
$U$ of $a$ and a constant $C > 0$ such that $|f(x)| \le C g(x)$ for $x \in U \setminus \{a\}$.
Analogously, we denote
\[ f = o(g) \quad \text{or} \quad f = o(g) \text{ as } x \to a \]
if there exists a neighborhood $U$ of $a$ and a function $\varepsilon : A \to (0, \infty)$ such that
$\lim_{x \to a} \varepsilon(x) = 0$ and $|f(x)| \le \varepsilon(x) g(x)$ for $x \in U \setminus \{a\}$.

Asymptotic equivalence is introduced as follows:
\[ f \sim g \text{ as } x \to a \quad \text{if} \quad \lim_{x \to a} \frac{f(x)}{g(x)} = 1. \]


These symbols, introduced by E. Landau, give a very convenient expression of 
many asymptotic formulas. Prove that: 


(a) $2x^3 - 3x + 1 = O(x^3)$ (as $x \to \infty$);
(b) $\sqrt{a^2 + x} = a + \frac{x}{2a} + O(x^2)$ (as $x \to 0$);
(c) “= 1+o0(x) (asx > 0). 


[Hint: (c) Use Corollary 4.3.4.] 


6.3 Discontinuities of a Function of Real Variable 


We start with the following variant of Theorem 6.2.4, concerning the characterization 
of continuity via limits: 


6.3.1 Theorem Let $f : A \to \mathbb{R}$ be a function defined on a subset of $\mathbb{R}$. If $a \in A$
and $a$ is a point of accumulation of $A$, then $f$ is continuous at $a$ if and only if each of the
one-sided limits of $f$ at $a$ (that makes sense) exists and equals the value of $f$ at $a$.

The function $f$ is automatically continuous at every isolated point of its domain
of definition.


Theorem 6.3.1 leads to the following classification of the points of discontinuity
of a function of a real variable. A point of discontinuity is said to be of the first kind
if the one-sided limits (that make sense) exist and are finite (and at least one of them
differs from the value of the function at that point). All other points of discontinuity
are said to be of the second kind.


6.3.2 Theorem Let $I$ be a nondegenerate interval and let $f : I \to \mathbb{R}$ be a monotone
function. Then, $f$ has at most discontinuities of the first kind, and their set is
countable.

Proof Suppose, for example, that the function $f$ is increasing and consider a point
$a \in I$ for which the limit to the left makes sense. We shall show that this limit exists
and is equal to $\ell = \sup\{f(x) : x \in I,\ x < a\}$. In fact, $f(x) \le f(a)$ for $x < a$,
which assures the existence of $\ell$ in $\mathbb{R}$; then for every $\varepsilon > 0$, there exists a point
$x_\varepsilon < a$ such that $\ell \ge f(x_\varepsilon) > \ell - \varepsilon$. As the function $f$ is increasing, we must have
$\ell \ge f(x) > \ell - \varepsilon$ for every $x \in [x_\varepsilon, a)$, and thus $\ell = \lim_{x \to a-} f(x)$. The case of the
points at which the limit to the right makes sense can be settled in a similar way. At
the interior points, both one-sided limits make sense and $\lim_{x \to a-} f(x) \le \lim_{x \to a+} f(x)$.


Let $D$ be the set of all points of discontinuity of $f$. To each $a \in D$, we attach a
nondegenerate interval
\[ I_a = \left( \lim_{x \to a-} f(x), \ \lim_{x \to a+} f(x) \right), \]
with the convention that the one-sided limit which does not make sense is replaced
by $f(a)$. Clearly, if $a, b \in D$ and $a < b$, then $I_a \cap I_b = \emptyset$. In this way, choosing for
each $a \in D$ a rational point $r_a \in I_a$, we get a one-to-one map $a \mapsto r_a$, from $D$ into
$\mathbb{Q}$. Consequently, the set $D$ is countable.


A characterization of the functions $f : [a, b] \to \mathbb{R}$ which admit only
discontinuities of the first kind is the subject of Theorem 9.1.6.

In connection with Theorem 6.3.2, let us recall here the result of A. Froda, which
asserts that the set of points of discontinuity of the first kind of every function $f : \mathbb{R} \to \mathbb{R}$
is countable. Details can be found in Exercise 6.

Dirichlet's function has no point of continuity. Its discontinuities are all of the second
kind.

Another surprising example of a discontinuous function is offered by Riemann's
function $R : (0, 1] \to \mathbb{R}$, which is defined by the formula
\[ R(x) = \begin{cases} 0 & \text{if } x \text{ is irrational} \\ 1/q & \text{if } x = p/q \text{ and } p, q \in \mathbb{N}^* \text{ with no common factors.} \end{cases} \]

This function is continuous at every irrational point and discontinuous at every
rational point of $(0, 1]$. In fact,
\[ \lim_{x \to a} R(x) = 0 \quad \text{for all } a \in (0, 1]. \]
See Exercise 3. By extending this function by periodicity, we obtain a function
$R : \mathbb{R} \to \mathbb{R}$ continuous at all irrational points and discontinuous at all rational
points of $\mathbb{R}$.
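
The reason why Riemann's function has limit zero everywhere is that, for each threshold, only finitely many points of $(0, 1]$ carry a value above it. The following Python sketch makes this count explicit on rational inputs, using exact arithmetic from the standard `fractions` module. The function names are illustrative assumptions, and irrational arguments (where the value is 0) are outside the scope of the code because they cannot be represented exactly.

```python
from fractions import Fraction


def riemann(x: Fraction) -> Fraction:
    """Riemann's function at a rational point p/q of (0, 1] in lowest terms: 1/q."""
    if not 0 < x <= 1:
        raise ValueError("x must lie in (0, 1]")
    # Fraction always stores p/q in lowest terms, so the denominator is q.
    return Fraction(1, x.denominator)


def large_value_points(eps: Fraction) -> set:
    """All rational points of (0, 1] at which the function is >= eps.

    Since 1/q >= eps exactly when q <= 1/eps, only denominators up to
    floor(1/eps) can occur, so the resulting set is finite.
    """
    max_q = int(1 / eps)
    return {Fraction(p, q) for q in range(1, max_q + 1) for p in range(1, q + 1)}


points = large_value_points(Fraction(1, 4))
```

Away from this finite set the values are below the threshold, which is the key observation behind the limit being zero at every point.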

It is natural to ask whether there exist functions $f : \mathbb{R} \to \mathbb{R}$ continuous at all
rational points and discontinuous at all irrational points of $\mathbb{R}$. Since both sets $\mathbb{Q}$ and
$\mathbb{R} \setminus \mathbb{Q}$ are dense in $\mathbb{R}$, the following result shows that the answer is negative.

6.3.3 Theorem (V. Volterra) Let $f, g : [0, 1] \to \mathbb{R}$ be two functions such that the
subsets of their points of continuity ($C_f$ and $C_g$) are, respectively, dense in the
interval $[0, 1]$. Then, $f$ and $g$ have a point of continuity in common.


Proof Put $I_0 = [0, 1]$ and choose a point $p_0 \in C_f \cap \operatorname{int} I_0$. The existence of $p_0$ is
assured by the fact that $C_f$ is dense in $I_0$.

Using the definition of continuity at $p_0$, we infer the existence of a nondegenerate
compact subinterval $J_0 \subset I_0$, centered at $p_0$, such that
\[ x, y \in J_0 \ \text{ implies } \ |f(x) - f(y)| < 1. \]

Similarly, we can choose a point $q_0 \in C_g \cap \operatorname{int} J_0$. By the continuity of $g$ at $q_0$,
we infer the existence of a nondegenerate compact subinterval $I_1 \subset J_0$, centered at
$q_0$, such that
\[ x, y \in I_1 \ \text{ implies } \ |g(x) - g(y)| < 1/2. \]

By mathematical induction, we can choose a nested sequence $(I_n)_n$ of compact
intervals such that
\[ |f(x) - f(y)| < 1/2^{n-1} \quad \text{and} \quad |g(x) - g(y)| < 1/2^n \tag{6.1} \]
for all $x, y \in I_n$ and all $n \in \mathbb{N}^*$. By the Nested Intervals Lemma, there exists a point
$z$ in the intersection of all these intervals. Because of relations (6.1), $z$ is necessarily
a point of continuity both for $f$ and $g$.


The theorem of Volterra raises the question of the topological nature of the set of
discontinuity of a function of a real variable. This can be easily solved via the concept
of oscillation at a point.

Let $f : I \to \mathbb{R}$ be a function defined on a nondegenerate interval. The oscillation
of $f$ at a point $a$ of $I$ is defined by
\[ \omega_f(a) = \inf\{\operatorname{diam} f(U) : U \in \mathcal{V}_a\}. \]
The following connection between continuity and oscillation is straightforward:

6.3.4 Lemma The function $f$ is continuous at $a$ if and only if $\omega_f(a) = 0$.

Since for every $r > 0$, the set
\[ \{x \in I : \omega_f(x) < r\} \]
is open, from Lemma 6.3.4 it follows that the set of points of continuity of every
function $f : I \to \mathbb{R}$ is of type $G_\delta$ and the set of points of discontinuity is of type $F_\sigma$.

By Theorem 6.3.3, the set $\mathbb{R} \setminus \mathbb{Q}$ is not of type $F_\sigma$. A more direct argument is
offered by Baire's Category Theorem.

It is worth noticing that every set $A \subset \mathbb{R}$ of type $F_\sigma$ is the set of discontinuity of
some function $f : \mathbb{R} \to \mathbb{R}$. See Gelbaum and Olmsted [1]. The particular case of
closed sets is the subject of Exercise 5.


Exercises 


1. Exhibit an example of function $f : [0, 1] \to \mathbb{R}$ that maps the convergent sequences
into convergent sequences and which admits discontinuities.
2. Let $(r_n)_{n \ge 1}$ be an enumeration of the rational numbers in $[0, 1]$. Prove that the
function
\[ f(x) = \sum_{\{n \,:\, r_n < x\}} \frac{1}{2^n} \]
is strictly increasing and its discontinuities form a countable dense set.

3. Prove that the extension by periodicity of Riemann’s function to R has limit zero 
at every point. 

4. Adapt the argument of Theorem 6.3.3, to infer a stronger conclusion, namely, that
the intersection $C_f \cap C_g$, of the sets of continuity, is an uncountable set, dense in
$[0, 1]$.

5. Let $A$ be a closed subset of $\mathbb{R}$. Verify that the characteristic function of the set
$B = \partial A \cup (\operatorname{int} A \cap \mathbb{Q})$ is an example of function $f : \mathbb{R} \to \mathbb{R}$ whose set of
discontinuities is exactly $A$.

6. (Froda's Theorem, see [13]). The purpose of this exercise is to sketch a proof of
the fact that every function $f : \mathbb{R} \to \mathbb{R}$ may have only countably many points of
discontinuity of the first kind.

(a) Prove that if $\varepsilon > 0$ and $\ell$ is the limit of a sequence $(a_n)_n$ of pairwise distinct
discontinuities of the first kind for which $\omega_f(a_n) > \varepsilon$, then $\ell$ is a point of
discontinuity of the second kind.

[Hint: Suppose that the sequence $(a_n)_n$ is increasing. Then, in every interval
$((a_{n-1} + a_n)/2, (a_n + a_{n+1})/2)$, one can choose points $u_n$ and $v_n$ such that
$f(u_n) - f(v_n) > \varepsilon/2$. If $\ell$ were a discontinuity of the first kind, then
$\lim_{n \to \infty} f(u_n) = \lim_{n \to \infty} f(v_n) = \lim_{x \to \ell-} f(x)$, a contradiction.]

(b) Prove that the set $D_n$, of all points $a$ of discontinuity of the first kind such that
$\omega_f(a) > 1/n$, is countable.

(c) Conclude that every function $f : \mathbb{R} \to \mathbb{R}$ may have only countably many
points of discontinuity of the first kind.


6.4 The Intermediate Value Theorem 


A remarkable property of continuous functions (known as the intermediate value
property) is that of mapping the intervals into intervals. This follows easily from
Proposition 2.5.1 (which asserts that the real intervals are precisely the convex subsets
of $\mathbb{R}$) and the following result due to B. Bolzano and A.-L. Cauchy, which shows
that continuous functions map intervals into convex sets.


6.4.1 The Intermediate Value Theorem Let $I$ be a nondegenerate interval and
$f : I \to \mathbb{R}$ be a continuous function. Then, for every two points $a$ and $b$ in $I$ and
every number $\lambda$ between $f(a)$ and $f(b)$, there exists a point $c$ between $a$ and $b$ such
that $\lambda = f(c)$.


The Intermediate Value Theorem is an immediate consequence of the following
lemma (applied to $h = f - \lambda$):


6.4.2 Lemma Let $h : [a, b] \to \mathbb{R}$ be a continuous function such that $h(a) < 0$ and
$h(b) > 0$. Then, there exists at least one point $c \in (a, b)$ such that $h(c) = 0$.


Proof The set $A = \{x : x \in [a, b] \text{ and } h(x) < 0\}$ is nonempty and bounded above,
so by the Completeness Axiom, $A$ admits a supremum, say $c$. Since $c$ is a closure point
for $A$, by Proposition 5.4.4 it follows that $c$ is the limit of a sequence of elements of $A$.
By Heine's characterization of continuity, we infer that $h(c) \le 0$; in particular, $c < b$.
On the other hand, from $c = \sup A$ it follows that for every sufficiently large positive
integer $n$, there exists a point $x_n$ in $(c, c + 1/n)$ such that $h(x_n) \ge 0$. Then
$h(c) = \lim_{n \to \infty} h(x_n) \ge 0$, by the continuity of $h$. Consequently, $h(c) = 0$.


The main applications of the Intermediate Value Theorem concern the existence of 
solutions for the equations of the form f(x) = 0, where f is a real-valued continuous 
function defined on an interval /. 


6.4.3 Example (The $n$th root of a positive number) The existence of the $n$th root of
a positive number was established in Proposition 1.3.1 as a direct consequence of
the Completeness Axiom. Here, we provide a much shorter argument. Let $a > 0$ and
$n \in \mathbb{N}$, $n \ge 2$. The function
\[ h : [0, \infty) \to \mathbb{R}, \quad h(x) = x^n - a, \]
is obviously continuous and verifies $h(0) < 0$. If $a > 1$, then $h(a) > 0$, while if
$a \in (0, 1)$, then $h(1) > 0$. According to Lemma 6.4.2, the equation
\[ x^n = a \]
has a positive solution $x^*$. Since $h$ is strictly increasing, this is the only positive
solution.


Establishing the existence of a solution is only the first step in solving an equation. 
The next step, even more important, is to find efficient algorithms to produce the 
solution. 

Suppose our equation is of the form
\[ f(x) = 0, \tag{6.2} \]
where $f$ is a real-valued continuous function defined on an interval $I$, having a
unique solution $x^* \in I$. Typically, any algorithm for finding $x^*$ produces a sequence
$(x_n)_n$ of points of $I$ converging to $x^*$, called approximate solutions. Since $f$ is
continuous, the sequence $(f(x_n))_n$ converges to zero. The question is how fast
these sequences converge. We are interested in fast algorithms such that for some $q \in (0, 1)$,
\[ x^* - x_n = O(q^n) \tag{AS1} \]
and
\[ f(x_n) = O(q^n) \tag{AS2} \]
as $n \to \infty$. The two conditions (AS1) and (AS2) do not imply each other. It is
possible to be near the solution while $f$ has large values, or for $f$ to be near 0 at points far
away from the exact solution. See Fig. 6.2.

Fig. 6.2 (a) $f(x)$ is large though $x$ is near $x^*$; (b) $f(x)$ is small but $x$ is far from $x^*$

The next example describes an algorithm satisfying condition (AS1). 


6.4.4 Example (The method of bisection) This method yields an algorithm for
finding a solution of the equation $f(x) = 0$, when $f : [a, b] \to \mathbb{R}$ is a continuous
function such that $f(a) < 0$ and $f(b) > 0$.

At step 1, we divide $[a, b]$ into two subintervals of equal length, by considering
the middle point $(a + b)/2$:
\[ [a, b] = [a, (a + b)/2] \cup [(a + b)/2, b]. \]

If $f((a + b)/2) = 0$, then we get a solution and the algorithm stops here. If $f$ does
not vanish at $(a + b)/2$, then the interval $[a, b]$ will be replaced by one of the two
subintervals, say $[a_1, b_1]$, where $f$ has an alternation of sign at the endpoints. At
step 2, we apply the division procedure to $[a_1, b_1]$, getting either a solution, or a
subinterval $[a_2, b_2]$ such that $b_2 - a_2 = (b - a)/2^2$ and $f(a_2) f(b_2) < 0$. Supposing
that the middle points do not yield solutions, this algorithm provides a sequence of
nested intervals $[a_n, b_n]$, with $b_n - a_n = (b - a)/2^n \to 0$ and $f(a_n) f(b_n) < 0$.
The intersection of all these intervals consists of a single point $x^*$, which is a solution
of the equation $f(x) = 0$.
The endpoints $a_n$ and $b_n$ are approximations of $x^*$, for which both $|a_n - x^*|$ and
$|b_n - x^*|$ are bounded above by $(b - a)/2^n$.
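
The method of bisection translates almost verbatim into code. The following Python sketch is one possible rendering under the sign convention of the example ($f(a) < 0 < f(b)$); the function name `bisection`, the tolerance parameter and the step cap are choices made for this illustration. As a usage example it approximates the cube root of 2, in the spirit of Example 6.4.3.

```python
def bisection(f, a, b, tol=1e-12, max_steps=200):
    """Approximate a zero of a continuous f on [a, b], assuming f(a) < 0 < f(b).

    Returns a pair (approximation, number of halving steps performed).
    """
    if not (f(a) < 0 < f(b)):
        raise ValueError("expected f(a) < 0 < f(b)")
    steps = 0
    while b - a > tol and steps < max_steps:
        mid = (a + b) / 2
        value = f(mid)
        steps += 1
        if value == 0:
            return mid, steps       # the middle point is an exact solution
        if value < 0:
            a = mid                 # keep the half with f(a) < 0 < f(b)
        else:
            b = mid
    return (a + b) / 2, steps


# Usage: the positive solution of x**3 = 2 on [0, 2].
root, steps = bisection(lambda x: x**3 - 2, 0.0, 2.0)
```

After $n$ steps the enclosing interval has length $(b - a)/2^n$, so the number of steps needed for a given tolerance grows only logarithmically with the required accuracy, which is the content of condition (AS1) with $q = 1/2$.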
When searching for the solutions of an equation of the form
\[ f(x) = 0, \]
it is sometimes convenient to rewrite it as a fixed point problem,
\[ g(x) = x. \]

The fixed points of the function $g$ are precisely the solutions of the equation $g(x) = x$.
Geometrically, they are the abscissas of intersection points of the graph of $g$ with the
line $y = x$.

Lemma 6.4.2 yields the following fixed point theorem: Every continuous function
$f : [a, b] \to [a, b]$ has a fixed point. See Exercise 1(a). There are many other known
fixed point theorems. See Exercise 8 at the end of this section and Theorem A.1.2 in
Appendix A.
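
The fixed point theorem above can be made effective by applying bisection to the auxiliary function $h(x) = x - g(x)$, which satisfies $h(a) \le 0 \le h(b)$ whenever $g$ maps $[a, b]$ into itself. The Python sketch below does this for $g = \cos$ on $[0, 1]$; the helper name `fixed_point` and the choice of example are assumptions made for illustration, not a method prescribed by the text.

```python
import math


def fixed_point(g, a, b, tol=1e-12, max_steps=200):
    """Locate a fixed point of a continuous g: [a, b] -> [a, b].

    Works by bisection on h(x) = x - g(x), for which h(a) <= 0 <= h(b).
    """
    def h(x):
        return x - g(x)

    if h(a) == 0:
        return a
    if h(b) == 0:
        return b
    lo, hi = a, b                   # invariant: h(lo) <= 0 < h(hi)
    for _ in range(max_steps):
        if hi - lo <= tol:
            break
        mid = (lo + hi) / 2
        if h(mid) <= 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Usage: cos maps [0, 1] into [cos 1, 1], a subset of [0, 1].
p = fixed_point(math.cos, 0.0, 1.0)
```

The returned point is the abscissa at which the graph of the cosine meets the line $y = x$, up to the chosen tolerance.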

The intermediate value property is not characteristic of continuous functions. See
Exercise 4 for an example.

The fact that any derivative has the intermediate value property is the subject
of Theorem 8.3.2.


6.4.5 Proposition Let $I$ be a nondegenerate interval and let $f : I \to \mathbb{R}$ be an
injective function with the intermediate value property. Then, $f$ is strictly monotone.


Proof Suppose that $f$ is not strictly monotone. Then, there exist $a, b, c \in I$ such
that $a < b < c$ and $f(b)$ is not between $f(a)$ and $f(c)$. In other words, one of the
following cases may occur:

(1) $f(b) < f(a) < f(c)$
(2) $f(a) < f(c) < f(b)$
(3) $f(b) < f(c) < f(a)$
(4) $f(c) < f(a) < f(b)$.

Suppose the case (1) and let $\lambda = f(a)$. Since $f$ has the intermediate value
property, there exists $\alpha \in (b, c)$ such that $\lambda = f(\alpha)$. Since $\alpha \ne a$, this fact
contradicts the injectivity of $f$. The other cases can be treated similarly.


We pass now to the problem of inverting continuous functions defined on intervals: 


6.4.6 Theorem Let $I$ be an interval and let $f : I \to \mathbb{R}$ be a continuous injective
function. Then, $f$ establishes a homeomorphism between the intervals $I$ and $J = f(I)$.


Proof According to Proposition 6.4.5, the function $f$ is strictly monotone. Suppose,
for example, that $f$ is strictly increasing. Then $f$ induces a bijection between $I$ and $J$,
and $f^{-1}$ is strictly increasing too. By Theorem 6.3.2, the function $f^{-1}$ has at most
discontinuities of the first kind. Actually no such discontinuity may occur. In fact, if
$f^{-1}(b) = a$ and $\lim_{y \to b-} f^{-1}(y) = \lambda \ne a$, then $\lambda < a$ since
\[ y < b \ \text{ implies } \ f^{-1}(y) < f^{-1}(b) = a. \]

If a sequence $y_n = f(x_n)$ approaches $b$ from the left, then the sequence $x_n$
approaches $\lambda$ from the left, so $\lambda \in I$ since $x_n < \lambda < a$ and $I$ is an interval. Then the
continuity of $f$ leads to
\[ f(x_n) \to f(\lambda) < f(a) = b, \]
a contradiction.


Exercises 


1. (a) Prove that every continuous function $f : [a, b] \to [a, b]$ has a fixed point.
(b) Prove that if the function $f : [-a, a] \to [-a, a]$ is continuous, then the
equation $f(x) = -x$ has at least one solution.


2. Prove that there exists no continuous function $f : \mathbb{R} \to \mathbb{R}$ such that


(fo f) (x) =x? forallx ER. 


[Hint: If a function g is strictly monotone, then g o g is strictly increasing. ] 


3. Prove that a function with the intermediate value property may have only
discontinuities of the second kind.


4. Consider a real parameter $\alpha$ and the even function $c_\alpha : \mathbb{R} \to \mathbb{R}$ such that
$c_\alpha(0) = \alpha$, and for every $n \in \mathbb{Z}$, $c_\alpha(2^n) = (-1)^n$ and $c_\alpha$ is affine on $[2^n, 2^{n+1}]$.
(a) Sketch the graph of this function.

(b) Prove that the function $c_\alpha$ has a discontinuity of the second kind at the origin,
whenever $\alpha \in \mathbb{R}$.

(c) Prove that the function $c_\alpha$ has the intermediate value property if and only if
$\alpha \in [-1, 1]$.


5. Prove that if $a < b$, then no two of the intervals $(a, b)$, $[a, b)$ and $[a, b]$ are
homeomorphic.


6. Let $f : [0, \infty) \to \mathbb{R}$ be a continuous surjective function. Prove that for every
$a \in \mathbb{R}$, the equation $f(x) = a$ has infinitely many solutions.


7. Prove that the equation


x + /% + 1000 sinazx —e =0 


has at least three solutions. 


8. (Knaster's Fixed Point Theorem). Prove that every increasing function
$f : [a, b] \to [a, b]$ admits at least one fixed point. Indicate an example where the
set of fixed points is countable.
[Hint: Consider the point $c = \sup\{x \in [a, b] : x \le f(x)\}$.]


6.5 Continuous Functions on Compact Intervals


The continuous functions defined on compact intervals have a series of properties, 
which make them very special. 


6.5.1 Theorem (K. Weierstrass) Let $f : [a, b] \to \mathbb{R}$ be a continuous function. Then
$f$ is bounded and attains its bounds. In other words, there exist two points $x_m$ and
$x_M$ in $[a, b]$ such that
\[ f(x_m) = \inf_{x \in [a, b]} f(x) \quad \text{and} \quad f(x_M) = \sup_{x \in [a, b]} f(x). \]


Proof We prove first that the function $f$ is bounded below and attains its greatest lower bound. In fact, letting $m = \inf_{x \in [a,b]} f(x)$, we infer the existence of a sequence $(x_n)_n$ of elements of $[a, b]$ such that $f(x_n) \to m$. By the Bolzano-Weierstrass Theorem, a subsequence $(x_{k_n})_n$ of it is convergent, say to a point $c \in [a, b]$. Using Heine's characterization of continuity, we conclude that $f(x_{k_n}) \to f(c)$, and thus $f(c) = m$.

The case of the least upper bound follows from the considerations above since

$$\sup_{x \in [a,b]} f(x) = -\inf_{x \in [a,b]} (-f(x)).$$


Taking into account the intermediate value property, the preceding result gives us valuable information on the image of a continuous function:


6.5.2 Corollary Let $f : [a,b] \to \mathbb{R}$ be a continuous function and let $m$ and $M$ be its greatest lower bound and its least upper bound, respectively. Then, $f([a, b]) = [m, M]$.


6.5.3 Corollary Suppose that $-\infty < a < b < \infty$. Then every continuous function $f : [a,b) \to \mathbb{R}$ that admits a finite limit at $b$ is bounded.


Another remarkable property of continuous functions on compact intervals is that 
of uniform continuity. 


6.5.4 Definition Let $A$ be a subset of $\mathbb{R}$. A function $f : A \to \mathbb{R}$ is called uniformly continuous if for every $\varepsilon > 0$, there exists $\delta > 0$ such that

$$x, y \in A,\ |x - y| < \delta \quad \text{implies} \quad |f(x) - f(y)| < \varepsilon.$$


Clearly, uniform continuity implies continuity. While continuity has a pointwise character, the property of being uniformly continuous has a global character, reflecting the behavior of the function on the entire domain. By attaching to a function $f : A \to \mathbb{R}$ the so-called modulus of continuity,

$$\omega_f(\delta) = \sup_{x, y \in A,\ |x - y| \le \delta} |f(x) - f(y)|, \quad \delta > 0,$$

we can express the property of uniform continuity of $f$ as

$$\lim_{\delta \to 0} \omega_f(\delta) = 0.$$

Fig. 6.3 The property of uniform continuity


Geometrically, the property of uniform continuity means that whenever $B$ is a subset of $A$ with diameter less than $\delta$, the diameter of its image is less than $\varepsilon$. See Fig. 6.3.

Clearly, the Lipschitz functions (particularly, the polynomial functions of the form $ax + b$) are uniformly continuous. The set of all uniformly continuous functions $f : A \to \mathbb{R}$ is a vector space with respect to the pointwise operations of addition and multiplication by scalars. However, it is not generally an algebra. In fact, the product of two uniformly continuous functions may not be a uniformly continuous function. For example, the function $f(x) = x^2$, $x \in \mathbb{R}$ (the product of the uniformly continuous function $x$ with itself), is not uniformly continuous since

$$\omega_f(\delta) \ge \sup_{x \in \mathbb{R}} |(x + \delta)^2 - x^2| = \sup_{x \in \mathbb{R}} |2x\delta + \delta^2| = \infty$$

for every $\delta > 0$.
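The contrast between a bounded and an unbounded modulus of continuity can be observed numerically. The following Python sketch is only an illustration: the helper name `modulus_of_continuity`, the grid size, and the windows $[0, R]$ are assumptions made here, and a finite grid gives merely a lower estimate of the supremum. It compares $\sqrt{x}$, whose modulus equals $\sqrt{\delta}$ on every window $[0, R]$ with $R \ge \delta$, with $x^2$, whose modulus on $[0, R]$ grows without bound as $R$ increases.

```python
import math

def modulus_of_continuity(f, a, b, delta, samples=2001):
    """Grid estimate of sup{|f(x) - f(y)| : x, y in [a, b], |x - y| <= delta}."""
    step = (b - a) / (samples - 1)
    xs = [a + i * step for i in range(samples)]
    shift = int(delta / step + 1e-9)      # number of grid steps that fit inside delta
    best = 0.0
    for i in range(samples):
        # compare xs[i] only with grid points at distance at most delta to its right
        for j in range(i + 1, min(i + shift, samples - 1) + 1):
            best = max(best, abs(f(xs[i]) - f(xs[j])))
    return best

def square(x):
    return x * x

for R in (1.0, 10.0, 100.0):
    w_root = modulus_of_continuity(math.sqrt, 0.0, R, 0.1)
    w_square = modulus_of_continuity(square, 0.0, R, 0.1)
    print(f"window [0, {R:g}]: sqrt -> {w_root:.4f}, x^2 -> {w_square:.4f}")
```

The estimate for $\sqrt{x}$ does not change with the window, while the estimate for $x^2$ behaves like $2R\delta - \delta^2$, in agreement with the computation above.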


6.5.5 Theorem (H.E. Heine) Every continuous function $f : [a,b] \to \mathbb{R}$ is uniformly continuous.


Proof By reductio ad absurdum. Suppose that $f$ is not uniformly continuous. Then there exist $\varepsilon > 0$ and two sequences $(x_n)_n$ and $(y_n)_n$ in $[a, b]$ such that $|x_n - y_n| < 1/n$ and $|f(x_n) - f(y_n)| \ge \varepsilon$ for every index $n \ge 1$. By the Bolzano-Weierstrass Theorem, we can choose a convergent subsequence $(x_{k(n)})_n$ of $(x_n)_n$. Analogously, $(y_{k(n)})_n$ contains a convergent subsequence $(y_{l(k(n))})_n$. Put $\lim_{n \to \infty} x_{k(n)} = x$ and $\lim_{n \to \infty} y_{l(k(n))} = y$. As

$$|x_{l(k(n))} - y_{l(k(n))}| < \frac{1}{l(k(n))} \le \frac{1}{n}$$

for every index $n \ge 1$, we infer (by taking the limit as $n \to \infty$) that $x = y$. On the other hand, the relation $|f(x_{l(k(n))}) - f(y_{l(k(n))})| \ge \varepsilon$ leads us to $|f(x) - f(y)| \ge \varepsilon$, a contradiction.


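6.5.6 Corollary Every continuous function $f : \mathbb{R} \to \mathbb{R}$ for which $\lim_{x \to -\infty} f(x) = \lim_{x \to \infty} f(x) = 0$ is uniformly continuous.


Proof According to our hypotheses, for $\varepsilon > 0$ arbitrarily fixed, there exists a compact interval $[-A, A]$ outside which $|f(x)| < \varepsilon/2$. Since the function $f$ is uniformly continuous on $[-A - 1, A + 1]$, there exists $\delta \in (0, 1)$ such that

$$x, y \in [-A - 1, A + 1] \text{ and } |x - y| < \delta \text{ imply } |f(x) - f(y)| < \varepsilon.$$

If $|x - y| < \delta$ and one of the two points lies outside $[-A - 1, A + 1]$, then both points lie outside $[-A, A]$ (because $\delta < 1$), so $|f(x) - f(y)| \le |f(x)| + |f(y)| < \varepsilon$. Therefore for all $x, y \in \mathbb{R}$ with $|x - y| < \delta$ we have $|f(x) - f(y)| < \varepsilon$.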


See Exercise 3 for a generalization of Corollary 6.5.6. 


6.5.7 Corollary Every continuous and periodic function $f : \mathbb{R} \to \mathbb{R}$ is uniformly continuous.


The uniformly continuous functions map Cauchy sequences into Cauchy sequences. This fact leads us to the following property of extension:


6.5.8 Theorem The uniformly continuous functions can be extended by continuity 
at every finite point of accumulation of their domain of definition. 


Proof In fact, let $f : A \to \mathbb{R}$ be a uniformly continuous function and let $a \in \mathbb{R} \setminus A$ be a point of accumulation of $A$. If $(x_n)_n$ is a sequence of elements of $A$, convergent to $a$, it is a Cauchy sequence and according to a remark above, the sequence $(f(x_n))_n$ must be Cauchy in $\mathbb{R}$, so convergent, according to the property of completeness of $\mathbb{R}$.

The proof ends by noticing that the value $\ell$ of the limit $\lim_{n \to \infty} f(x_n)$ does not depend on the particular sequence $(x_n)_n$ such that $x_n \to a$. For this, we use the method of interlacing. If $x_n \to a$ and $y_n \to a$, then the sequence

$$x_0, y_0, x_1, y_1, \ldots,$$

obtained by interlacing the terms of the two sequences, is also convergent to $a$. The above reasoning shows that the sequence of their images

$$f(x_0), f(y_0), f(x_1), f(y_1), \ldots$$

is convergent, so $\lim_{n \to \infty} f(x_n) = \lim_{n \to \infty} f(y_n)$. According to Heine's Criterion on the existence of limits, this fact yields the existence of $\lim_{x \to a} f(x)$.

Since the function $\sin(1/x)$ cannot be extended by continuity at $x = 0$, Theorem 6.5.8 shows us that this function is not uniformly continuous on $\mathbb{R} \setminus \{0\}$.
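The failure of uniform continuity of $\sin(1/x)$ can also be seen directly from Definition 6.5.4. The Python sketch below is an illustration only; the function name `witness_pair` and the particular points $x_n = 1/(2\pi n + \pi/2)$ and $y_n = 1/(2\pi n + 3\pi/2)$ are choices made here. The two points get arbitrarily close, while the values of the function stay at distance 2.

```python
import math

def witness_pair(n):
    """Points x_n, y_n > 0 with sin(1/x_n) = 1 and sin(1/y_n) = -1."""
    x = 1.0 / (2 * math.pi * n + math.pi / 2)
    y = 1.0 / (2 * math.pi * n + 3 * math.pi / 2)
    return x, y

for n in (1, 10, 100, 1000):
    x, y = witness_pair(n)
    gap = abs(x - y)                                   # tends to 0 as n grows
    jump = abs(math.sin(1 / x) - math.sin(1 / y))      # stays equal to 2
    print(f"n={n:5d}  |x-y|={gap:.3e}  |f(x)-f(y)|={jump:.6f}")
```

Hence no $\delta > 0$ can serve for $\varepsilon = 1$.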


Exercises 


1. Prove that the function $f(x) = \sqrt{|x|}$ is uniformly continuous on $\mathbb{R}$, but not Lipschitz on any interval containing the origin.

2. (A variant of Theorem 6.5.1 for noncompact intervals). Suppose that $f$ is a continuous function from $\mathbb{R}$ into itself, whose sublevel sets $L_\lambda = \{x : f(x) \le \lambda\}$ are compact for all $\lambda \in \mathbb{R}$.

(a) Prove that $f$ admits a global minimum, that is, a point $c \in \mathbb{R}$ such that $f(c) \le f(x)$ for every $x \in \mathbb{R}$.

(b) The compactness of the sublevel sets can be easily checked for continuous functions whose limit at $-\infty$ and $\infty$ is $\infty$. Verify this for the function $(x + \sin x^2)e^{x} - 5(x + \cos x^2)e^{-x}$.

3. Prove the following generalization of Corollary 6.5.6: Every continuous function $f : \mathbb{R} \to \mathbb{R}$ which admits slant asymptotes at $-\infty$ and at $\infty$ is uniformly continuous.

4. Prove Corollary 6.5.7: Every continuous and periodic function $f : \mathbb{R} \to \mathbb{R}$ is uniformly continuous.

5. Prove that the function $\cos x^2$ is not uniformly continuous on $\mathbb{R}$.

6. Let $f : \mathbb{R} \to \mathbb{R}$ be a uniformly continuous function. Prove that there exist two positive constants $A$ and $B$ such that

$$|f(x)| \le A|x| + B \quad \text{for all } x \in \mathbb{R}.$$


7. Prove that a function $f : \mathbb{R} \to \mathbb{R}$ is continuous if and only if $f$ has the intermediate value property and maps the compact sets into compact sets.
[Hint: Use Exercise 3 in Sect. 6.4.]

8. Prove that:
(a) $\cos x + \sin(\sqrt{2}\,x) < 2$ for all $x \in \mathbb{R}$;
(b) $\sup\{\cos x + \sin(\sqrt{2}\,x) : x \in \mathbb{R}\} = 2$.
Infer that the function $\cos x + \sin(\sqrt{2}\,x)$ is not periodic.
[Hint: See Kronecker's Theorem of denseness (Exercise 5 at the end of Sect. 1.4).]


6.6 Continuous Functions on Metric Spaces 


In what follows, we will discuss the notion of continuous function in the context of metric spaces. Such an extension is needed for understanding some deep phenomena within Real Analysis on intervals (for example, the chaotic behavior of some logistic maps). See Appendix A. Another reason is the construction of elementary functions as continuous functions of a complex variable. See Chap. 7.

Definition 6.1.1 of continuity of a function at a point $a$ can be adapted easily to the context of metric spaces by observing that the meaning of $|x - a|$ in that context is $d(x, a)$.

Let M and N be two metric spaces. 


6.6.1 Definition A function $f : M \to N$ is called continuous at the point $a \in M$ if for every $\varepsilon > 0$, there exists $\delta > 0$ such that

$$x \in M,\ d(x, a) < \delta \quad \text{implies} \quad d(f(x), f(a)) < \varepsilon.$$


A continuous function is a function continuous at every point of its domain. The functions which are not continuous are said to be discontinuous.


An important class of continuous functions is that of Lipschitz functions. A function $f : M \to N$ is called Lipschitz if there exists a real constant $L \ge 0$ such that

$$d(f(x), f(y)) \le L\, d(x, y) \quad \text{for every } x, y \in M.$$

The Lipschitz constant of $f$ is defined by the formula

$$\operatorname{Lip}(f) = \sup_{x \ne y} \frac{d(f(x), f(y))}{d(x, y)}$$

and represents the smallest number $L \ge 0$ for which $f$ verifies the Lipschitz condition.

Examples of Lipschitz functions are the projections $\mathrm{pr}_k : \mathbb{R}^n \to \mathbb{R}$, the norm function of any normed linear space, the distance function $d : M \times M \to \mathbb{R}$, etc. The Lipschitz functions $f$ with $\operatorname{Lip}(f) < 1$ are called contractions. The usefulness of contractions in solving nonlinear equations is outlined in Theorem A.1.2, Appendix A.
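For real functions on an interval, both notions can be explored numerically. The Python sketch below is only an illustration under assumptions made here: the names `lipschitz_estimate` and `iterate` are hypothetical, the grid search yields a lower estimate of $\operatorname{Lip}(f)$, and the iteration $x \mapsto f(x)$ is the standard fixed-point scheme for contractions (Theorem A.1.2 itself is not reproduced in this section). It is applied to $\cos$ on $[0, 1]$, which maps $[0, 1]$ into itself and satisfies $\operatorname{Lip}(\cos) = \sin 1 < 1$ there.

```python
import math

def lipschitz_estimate(f, a, b, samples=400):
    """Lower estimate of Lip(f) on [a, b]: largest difference quotient over grid pairs."""
    xs = [a + (b - a) * i / (samples - 1) for i in range(samples)]
    best = 0.0
    for i in range(samples):
        for j in range(i + 1, samples):
            best = max(best, abs(f(xs[i]) - f(xs[j])) / (xs[j] - xs[i]))
    return best

def iterate(f, x0, steps):
    """Apply f repeatedly, starting from x0."""
    x = x0
    for _ in range(steps):
        x = f(x)
    return x

L = lipschitz_estimate(math.cos, 0.0, 1.0)
p = iterate(math.cos, 0.0, 100)
print(f"estimated Lip(cos) on [0, 1]: {L:.4f}  (sin 1 = {math.sin(1.0):.4f})")
print(f"approximate solution of cos x = x: {p:.10f}")
```

Because the estimated constant is below 1, the iterates approach the unique solution of $\cos x = x$ in $[0, 1]$.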

The topological characterization of continuity extends verbatim to the context of 
metric spaces. 


6.6.2 Proposition (The topological characterization of continuity) A function $f : M \to N$ is continuous at a point $a$ if and only if the inverse image of every neighborhood of $f(a)$ is a neighborhood of $a$.


Proposition 6.6.2 assures the extension to the context of metric spaces of Theorem 6.1.3 as well as of its consequences: Corollary 6.1.4, Theorem 6.1.5 (Heine's characterization of continuity in terms of convergent sequences), Lemmas 6.1.6 and 6.1.7 and so on. A consequence of Heine's characterization of continuity is that a function $f : M \to \mathbb{C}$ is continuous at a point $a$ if and only if its real part and its imaginary part are both continuous at $a$.


6.6.3 Theorem (The extension of Weierstrass' Theorem) Continuous functions map compact sets into compact sets. Particularly, every continuous function $f : K \to \mathbb{R}$, defined on a compact metric space $K$, is bounded and attains its bounds.


Proof Let $f : M \to N$ be a continuous function and let $K$ be a compact subset of $M$. We shall show that $f(K)$ is a compact subset of $N$. To this end, let $(D_\alpha)_{\alpha \in A}$ be an open covering of $f(K)$. Since $f$ is continuous, the inverse images $f^{-1}(D_\alpha)$ are open sets and it is clear that their union covers $K$. Since $K$ is compact, one can choose a finite subcovering $(f^{-1}(D_\alpha))_{\alpha \in F}$ of $K$ and this yields a finite covering $(D_\alpha)_{\alpha \in F}$ of $f(K)$. Consequently, the set $f(K)$ is compact.


6.6.4 Theorem (H.E. Heine) Every complex-valued continuous function $f$ defined on a compact metric space $K$ is uniformly continuous, that is, for every $\varepsilon > 0$, there exists $\delta > 0$ such that

$$x, y \in K,\ d(x, y) < \delta \quad \text{implies} \quad |f(x) - f(y)| < \varepsilon.$$


The proof of Theorem 6.6.4 mimics the case of functions defined on compact intervals. See Theorem 6.5.5 above.

The unit circle,

$$S^1 = \{z \in \mathbb{C} : |z| = 1\},$$

is a compact space with respect to the topology induced by the Euclidean distance in the plane. It is also a commutative group with respect to multiplication and the operation of multiplication is uniformly continuous on $S^1 \times S^1$.

The notion of limit, as well as Heine’s criterion of existence of limits, also extends 
in a straightforward manner. However, the concept of one-sided limit remains spe- 
cialized to functions defined on intervals of R. 


6.6.5 Limit inferior and limit superior of a function Assume $f : A \to \mathbb{R}$ is a function defined on a subset $A$ of the metric space $M$ and $a \in M$ is a point of accumulation of $A$. Following the case of sequences, we can attach to $f$ two limiting bounds at $a$, which exist in the extended real line $\overline{\mathbb{R}}$ even when the limit at $a$ fails to exist. They are the limit superior,

$$\limsup_{x \to a} f(x) = \lim_{\varepsilon \to 0} \Big( \sup_{x \in A \cap (B_\varepsilon(a) \setminus \{a\})} f(x) \Big),$$

and respectively the limit inferior,

$$\liminf_{x \to a} f(x) = \lim_{\varepsilon \to 0} \Big( \inf_{x \in A \cap (B_\varepsilon(a) \setminus \{a\})} f(x) \Big).$$

As usual, $B_\varepsilon(a)$ denotes the open ball of radius $\varepsilon > 0$, centered at $a$. Clearly,

$$-\infty \le \inf_{x \in A} f(x) \le \liminf_{x \to a} f(x) \le \limsup_{x \to a} f(x) \le \sup_{x \in A} f(x) \le \infty,$$

and the limit of $f$ at $a$ exists and equals $\ell \in \overline{\mathbb{R}}$ if and only if

$$\liminf_{x \to a} f(x) = \limsup_{x \to a} f(x) = \ell.$$

We end this section by discussing two classes of functions that satisfy weaker variants of continuity.

A function $f : M \to \mathbb{R} \cup \{\infty\}$ is called lower semi-continuous at the point $a \in M$ if for every $\lambda < f(a)$, there exists a neighborhood $U$ of $a$ such that

$$x \in U \quad \text{implies} \quad f(x) > \lambda.$$

The function $f : M \to \mathbb{R} \cup \{-\infty\}$ is called upper semi-continuous at the point $a \in M$ if for every $\mu > f(a)$, there exists a neighborhood $U$ of $a$ such that

$$x \in U \quad \text{implies} \quad f(x) < \mu.$$


Clearly, a function $f : M \to \mathbb{R} \cup \{-\infty\}$ is upper semi-continuous at $a$ if and only if the function $-f$ is lower semi-continuous at that point. This reduces the study of upper semi-continuity to lower semi-continuity.

A function $f : M \to \mathbb{R}$ is continuous if and only if it is at the same time lower semi-continuous and upper semi-continuous at all points.

The characteristic function of a subset $A$ of $M$ is lower semicontinuous on $M$ if and only if $A$ is open. See Exercise 6.

When all points of $M$ are points of accumulation, the property of a function $f : M \to \mathbb{R} \cup \{\infty\}$ of being lower semi-continuous on $M$ is equivalent to

$$f(a) \le \liminf_{x \to a} f(x) \quad \text{for every } a \in M.$$


6.6.6 Theorem Every lower semi-continuous function $f$ defined on a compact metric space $K$ is bounded below and attains its lower bound. In other words, there is a point $a \in K$ such that $f(a) = \inf_{x \in K} f(x)$.


Proof Indeed, for each $\lambda > m = \inf_{x \in K} f(x)$, let us denote by $K_\lambda$ the set of all $x \in K$ such that $f(x) \le \lambda$. This set is closed (see Exercise 6) and nonempty. Every finite intersection $K_{\lambda_1} \cap K_{\lambda_2} \cap \cdots \cap K_{\lambda_n}$ is nonempty as being equal to $K_{\min\{\lambda_1, \ldots, \lambda_n\}}$. Since $K$ is compact, $\bigcap_{\lambda > m} K_\lambda$ is nonempty. Let $a$ be a point in this intersection. Then $f(a) \le m$, so $f(a) = m$ due to the definition of $m$.


Exercises 


1. Prove that all polynomial functions $a_0 z^n + a_1 z^{n-1} + \cdots + a_n$ (with complex coefficients) are continuous on $\mathbb{C}$.

2. (Heine's characterization of continuity). Let $M$ and $N$ be two metric spaces. Prove that a function $f : M \to N$ is continuous at a point $a$ if and only if it maps the sequences convergent to $a$ into sequences convergent to $f(a)$.

3. Prove that every continuous function $f : \mathbb{R} \to \mathbb{R}$ which admits finite limits at $-\infty$ and $\infty$ has a unique continuous extension to $\overline{\mathbb{R}}$.


4. Prove that for every continuous function $f : S^1 \to \mathbb{R}$, there exists a pair of opposite points $P, Q \in S^1$ such that $f(P) = f(Q)$; here $S^1$ represents the unit circle in the complex plane.

5. Let $f : \mathbb{C} \to \mathbb{C}$ be a continuous function such that $\lim_{|z| \to \infty} f(z) = 0$ (that is, for every $\varepsilon > 0$, there is $\delta > 0$ for which $|z| > \delta$ implies $|f(z)| < \varepsilon$). Prove that:
(a) $f$ is bounded and uniformly continuous;
(b) the inverse image through $f$ of every compact set is a compact set.


6. Given a function $f : M \to \mathbb{R} \cup \{\infty\}$, prove that the following conditions are equivalent:
(a) $f$ is lower semi-continuous on $M$;
(b) $\{x \in M : f(x) > \lambda\}$ is an open set for every $\lambda \in \mathbb{R}$;
(c) all of the lower level sets $\{x \in M : f(x) \le \lambda\}$ are closed.

7. Let $M$ be a metric space. Prove that the characteristic function of a subset $A$ of $M$ is lower (upper) semicontinuous if and only if $A$ is open (closed).


8. (a) Prove that the sum of two lower semicontinuous functions, as well as their product when the functions are also positive, is a lower semicontinuous function.
(b) Prove that the minimum of finitely many lower semicontinuous functions is also a lower semicontinuous function.
(c) Suppose that $(f_i)_{i \in I}$ is a family of lower semicontinuous functions from the metric space $M$ to $\mathbb{R} \cup \{\infty\}$. Prove that its upper envelope, $\sup_{i \in I} f_i$, defined pointwise by the formula

$$\Big( \sup_{i \in I} f_i \Big)(x) = \sup_{i \in I} f_i(x),$$

is a lower semicontinuous function.

9. The Exercises 7 and 8 above show that the lower semicontinuous functions provide a functional analogue of the notion of topology. Translate in terms of lower semicontinuous functions the notion of compact set.

10. (M.D. Kirszbraun) Suppose $U$ is a subset of the metric space $M$ and $f : U \to \mathbb{R}$ is a Lipschitz function. Verify that the function $\tilde{f} : M \to \mathbb{R}$ defined by

$$\tilde{f}(x) = \inf\{ f(u) + \operatorname{Lip}(f)\, d(x, u) : u \in U \} \quad \text{for } x \in M,$$

extends $f$ and has the same Lipschitz constant as $f$.

11. Extend Exercises 6 and 7, Sect. 6.1, to the framework of complete metric spaces.


6.7 The Universal Surjectivity of Cantor's Triadic Set


An important fact concerning the compact metric spaces is the universal surjectivity 
of Cantor’s triadic set. 


6.7.1 Theorem (P.S. Alexandrov and F. Hausdorff) For each compact metric space $K$, there is a continuous map $\varphi$ from the Cantor set $A$ onto $K$.


Proof For $K = [a, b]$, we can choose $\varphi$ of the form

$$\varphi(x) = a + (b - a) \sum_{n=1}^{\infty} \frac{a_n}{2^{n+1}},$$

where $x = \sum_{n=1}^{\infty} \frac{a_n}{3^n}$, with $a_n \in \{0, 2\}$, is the triadic representation of $x$.


The details in the general case may be found in many books, see for example [2], 
p. 39. 
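The formula for $\varphi$ in the case $K = [a, b]$ can be tried out on points of the Cantor set having finitely many nonzero triadic digits. In the Python sketch below, the function names and the use of exact fractions are choices made here, not part of the text; a point is represented by its list of digits $a_n \in \{0, 2\}$. The helper `digits_for` produces, for a dyadic target $k/2^d$ in $[0, 1)$, a point of the Cantor set that $\varphi$ sends to it, which illustrates surjectivity onto the dyadic rationals (the remaining points, including $b$ itself, require infinite expansions).

```python
from fractions import Fraction

def cantor_point(digits):
    """x = sum a_n / 3^n for triadic digits a_n in {0, 2} (finite expansion)."""
    assert all(d in (0, 2) for d in digits)
    return sum(Fraction(d, 3 ** n) for n, d in enumerate(digits, start=1))

def phi(digits, a=Fraction(0), b=Fraction(1)):
    """phi(x) = a + (b - a) * sum a_n / 2^(n+1), computed from the digits of x."""
    return a + (b - a) * sum(Fraction(d, 2 ** (n + 1)) for n, d in enumerate(digits, start=1))

def digits_for(k, depth):
    """Triadic digits (0 or 2) of a Cantor point that phi maps to k / 2^depth on [0, 1]."""
    bits = [(k >> (depth - n)) & 1 for n in range(1, depth + 1)]   # binary digits of k / 2^depth
    return [2 * bit for bit in bits]

digits = [2, 0, 2]
print(cantor_point(digits), "->", phi(digits))
```

Replacing each triadic digit 2 by the binary digit 1 is exactly what the factor $1/2^{n+1}$ accomplishes.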


The last theorem has many striking applications; for example, it assures the existence of curves filling the unit square (or any other compact convex set in $\mathbb{R}^p$).


6.7.2 Corollary Assume that $K$ is a compact convex subset of $\mathbb{R}^p$. Then, there is a continuous map from the unit interval $[0, 1]$ onto $K$.


Proof By Theorem 6.7.1, there is a continuous surjective map $\varphi : A \to K$. The complement of $A$ in $[0, 1]$ is a countable union of open intervals. If $(a, b)$ is such an interval, then we can extend $\varphi$ along $[a, b]$ by letting

$$\varphi((1 - t)a + tb) = (1 - t)\varphi(a) + t\varphi(b) \quad \text{for all } t \in [0, 1].$$

Since $K$ is convex, the extended map still takes its values in $K$.


The method of approximating a continuous function across an interval by joining 
the values at the end points by a linear segment is known as linear interpolation. 
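The interpolation formula used in the proof of Corollary 6.7.2 translates directly into code. The Python sketch below is an illustration only; the name `interpolate` and the representation of points of $\mathbb{R}^p$ as tuples are assumptions made here.

```python
from fractions import Fraction

def interpolate(a, b, pa, pb, s):
    """Value at s in [a, b] of the affine map sending a to pa and b to pb (points given as tuples)."""
    t = (s - a) / (b - a)              # s = (1 - t) a + t b with t in [0, 1]
    return tuple((1 - t) * u + t * v for u, v in zip(pa, pb))

# Extension across the gap (1/3, 2/3), with hypothetical endpoint values (0, 0) and (1, 2).
a, b = Fraction(1, 3), Fraction(2, 3)
for s in (a, Fraction(1, 2), b):
    print(s, "->", interpolate(a, b, (0, 0), (1, 2), s))
```

Each interpolated value is a convex combination of the two endpoint values, so it remains in any convex set containing them.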

A continuous function $\gamma$ from $[0,1]$ to $\mathbb{R}^p$ represents geometrically a curve. Intuitively, a curve is one-dimensional, but Corollary 6.7.2 shows the existence for each integer $p \ge 2$ of a $p$-dimensional curve! Such curves are usually called Peano curves, after Giuseppe Peano, who discovered in 1890 the first concrete example of a curve whose image is the unit square. See Exercise 6 for a concrete example of such a curve.

Let $E$ and $F$ be two metric spaces. We say that a map $f : E \to F$ is a homeomorphism if $f$ is bijective and both $f$ and its inverse $f^{-1}$ are continuous. The spaces $E$ and $F$ are called homeomorphic if there is a homeomorphism $f : E \to F$. In this case, we can identify $E$ and $F$ from the topological point of view.

In general, it is difficult to say whether two spaces are homeomorphic or not. One way is to look at the topological invariants, that is, at those properties that are preserved under homeomorphisms. According to Theorem 6.6.3, compactness is such an invariant.

Call a metric space a Cantor-like set if it is homeomorphic to Cantor's triadic set. A Cantor-like set may look very different from $A$. See Exercise 4. A topological characterization of the Cantor-like sets is as follows:


6.7.3 Theorem Let $M$ be a metric space that fulfills the following three conditions:

(a) $M$ is compact;

(b) $M$ is perfect (that is, all points of $M$ are points of accumulation);

(c) $M$ is totally disconnected (that is, for every point $x \in M$, each neighborhood $V \in \mathcal{V}_x$ contains a neighborhood $W \in \mathcal{V}_x$ which is both open and closed).

Then $M$ is homeomorphic to Cantor's triadic set.


Proof We will need three steps.

Step 1. For each $\varepsilon > 0$, there is an integer $N \ge 1$ such that for every integer $n \ge N$, the space $M$ admits a partition into $n$ nonempty subsets that are both open and closed and have diameters smaller than $\varepsilon$.

In fact, by (a) and (c), we get a finite covering $(C_k)_{k=1}^{p}$ of $M$, consisting of nonempty sets that are both open and closed and have diameters smaller than $\varepsilon$. Put

$$D_1 = C_1, \qquad D_k = C_k \setminus \bigcup_{i=1}^{k-1} C_i \quad \text{for } k = 2, \ldots, p.$$

Eliminating the empty sets among $D_1, \ldots, D_p$, we get a partition of $M$ consisting of $N$ nonempty sets that are both open and closed and have diameters smaller than $\varepsilon$. Since $M$ is perfect, each of these sets is infinite and the process of partitioning can be continued.

Step 2. We motivate the existence of a sequence $(\{A_n, B_n\})_{n \ge 1}$ of pairs of nonempty subsets of $M$ such that:

(1) $M = A_n \cup B_n$ and $A_n \cap B_n = \emptyset$ for each index $n$.

(2) The sets $A_n$ and $B_n$ are both open and closed.

(3) For every pair $(x, y)$ of distinct elements of $M$, there is an index $n$ for which $x \in A_n$ and $y \in B_n$.

(4) For every nonempty finite subset $F$ of $\mathbb{N}^*$ and every choice $M_n \in \{A_n, B_n\}$ for $n \in F$, we have

$$\bigcap_{n \in F} M_n \ne \emptyset.$$

In fact, according to Step 1, for $\varepsilon = 1/2$, there is a partition

$$M = \bigcup_{k=1}^{2^{N_1}} D_k$$

consisting of nonempty sets that are both open and closed, and having diameters smaller than $1/2$. Put

$A_1$ = union of the sets $D_k$ for $k \in \{1, \ldots, \tfrac{1}{2} \cdot 2^{N_1}\}$,
$A_2$ = union of the sets $D_k$ for $k \in \{1, \ldots, \tfrac{1}{4} \cdot 2^{N_1}\} \cup \{\tfrac{1}{2} \cdot 2^{N_1} + 1, \ldots, \tfrac{3}{4} \cdot 2^{N_1}\}$,
$A_3$ = union of the sets $D_k$ for $k$ in the union
$\{1, \ldots, \tfrac{1}{8} \cdot 2^{N_1}\} \cup \{\tfrac{2}{8} \cdot 2^{N_1} + 1, \ldots, \tfrac{3}{8} \cdot 2^{N_1}\} \cup \{\tfrac{4}{8} \cdot 2^{N_1} + 1, \ldots, \tfrac{5}{8} \cdot 2^{N_1}\} \cup \{\tfrac{6}{8} \cdot 2^{N_1} + 1, \ldots, \tfrac{7}{8} \cdot 2^{N_1}\}$,
$\ldots$
$A_{N_1} = D_1 \cup D_3 \cup \cdots \cup D_{2^{N_1} - 1}$,

and denote by $B_1, \ldots, B_{N_1}$, respectively, their complementary sets.

Similarly, each of the sets $D_k$ can be further decomposed into nonempty sets that are both open and closed, and having diameters smaller than $1/2^2$,

$$D_k = \bigcup_{j=1}^{2^{N_2}} D_{kj},$$

which gives rise to the pairs $\{A_{N_1+1}, B_{N_1+1}\}, \ldots, \{A_{N_1+N_2}, B_{N_1+N_2}\}$ and so on.

Step 3. We exhibit a homeomorphism $f : M \to A$.

Let $(\{A_n, B_n\})_{n \ge 1}$ be a sequence as indicated at Step 2. We define $f$ by the formula

$$f(x) = 2 \sum_{n=1}^{\infty} \frac{a_n}{3^n},$$

where

$$a_n = \begin{cases} 0 & \text{if } x \in A_n, \\ 1 & \text{if } x \in B_n. \end{cases}$$

The property (3) assures that $f$ is injective, while property (4) implies that the range of $f$ is dense in $A$.

The continuity of $f$ follows from (2) and the fact that $|f(x) - f(y)| < 2/3^n$ if $x, y \in A_k$ for all $k \in \{1, \ldots, n\}$.

Therefore, the range of $f$ is compact and thus it fills out the whole space $A$.

Since $M$ is compact, the closed subsets of $M$ are also compact and the continuity of $f^{-1}$ follows from the fact that $f$ maps the compact subsets of $M$ into compact subsets of $A$.

When M is a subspace of R, the condition of total disconnection means simply 
that M contains no nondegenerate interval. 


Exercises 


1. Prove that the Euclidean spaces $\mathbb{R}$ and $\mathbb{R}^2$ are not homeomorphic.

2. (Self-similarity of $A$).
(a) Prove that the linear map $f(x) = 3x$ establishes a homeomorphism between $A \cap [0, 1/3]$ and $A$.
(b) Show that the portion of $A$ contained in any interval remaining at the $n$th stage of its construction is homeomorphic to the whole set $A$. In other words, $A$ is similar to a part of itself.

3. (A "fat" Cantor set). Consider the set $M$ obtained from $[0, 1]$ by a construction similar to that of $A$, where the middle fifth of each remaining subinterval is removed. Prove that $M$ is homeomorphic with $A$ though it is not a Lebesgue null set.

4. Consider the metric space $\{0, 1\}^{\mathbb{N}}$ (of all sequences taking values 0 and 1), endowed with the metric

$$d(x, y) = \sum_{n=0}^{\infty} \frac{|x_n - y_n|}{2^n},$$

where $x = (x_n)_n$ and $y = (y_n)_n$. Usually this space is denoted $2^{\mathbb{N}}$. Prove that:
(a) $d(x, y) \le \frac{1}{2^n}$ if $x_k = y_k$ for $k = 0, \ldots, n$, while $d(x, y) < \frac{1}{2^n}$ implies $x_k = y_k$ for $k = 0, \ldots, n$.
(b) The canonical projections $\mathrm{pr}_k : 2^{\mathbb{N}} \to \mathbb{R}$, $\mathrm{pr}_k(x) = x_k$, are continuous, and a sequence $(x_n)_n$ converges to $x$ in $2^{\mathbb{N}}$ if and only if it is coordinatewise convergent ($\mathrm{pr}_k(x_n) \to \mathrm{pr}_k(x)$ for all $k$).
(c) $2^{\mathbb{N}}$ is a perfect metric space.
(d) $2^{\mathbb{N}}$ is complete.
(e) $2^{\mathbb{N}}$ is compact.
(f) The map $f : 2^{\mathbb{N}} \to A$ given by $f(x) = \sum_{n=0}^{\infty} \frac{2x_n}{3^{n+1}}$ is a homeomorphism.
(g) Similar results work if the role of 0 and 1 is played by any other pair of distinct numbers.
[Hint: For (e) use Exercise 9(c), Sect. 5.5.]

5. (Benyamini [3]). Consider the metric space $M = [-1, 1]^{\mathbb{N}}$ (of all sequences taking values in the interval $[-1, 1]$) when endowed with the metric $d$, given by the same formula as in the preceding exercise. This space is complete and compact.
(a) Put $X = \{x/2 : x \in A\}$. This is a closed set homeomorphic to $A$. Prove that

$$(X + m) \cap (X + n) = \emptyset \quad \text{for all } m, n \in \mathbb{N},\ m \ne n.$$

(b) By Theorem 6.7.1, there is a continuous map $\varphi$ from $A$ onto $M$. Consider the map $f : \bigcup_{n \in \mathbb{N}} (X + n) \to [-1, 1]$ given by

$$f(t + n) = \varphi(2t)(n)$$

and extend it to $[0, \infty)$ by linear interpolation. Prove that the function $f : [0, \infty) \to [-1, 1]$ so obtained is continuous and for every sequence $(x_n)_n$ of points in $[-1, 1]$ there is a point $t \in [0, \infty)$ such that

$$x_n = f(t + n) \quad \text{for all } n \in \mathbb{N}.$$

Remark Benyamini [3] proved also the existence of continuous real-valued functions $f$ on $\mathbb{R}$ such that for every bounded sequence $(x_n)_n$ in $\mathbb{R}$, there is a point $t \in \mathbb{R}_+$ such that $x_n = f(t + n)$ for all $n \in \mathbb{N}$.

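6. (Schoenberg's example of Peano curve). Consider the periodic function $f$ of prime period 2, which is defined on $[-1, 1]$ as follows:

$$f(x) = \begin{cases} 1 & \text{if } x \in [-1, -\tfrac{2}{3}] \cup [\tfrac{2}{3}, 1] \\ -3x - 1 & \text{if } x \in [-\tfrac{2}{3}, -\tfrac{1}{3}) \\ 0 & \text{if } x \in [-\tfrac{1}{3}, \tfrac{1}{3}) \\ 3x - 1 & \text{if } x \in [\tfrac{1}{3}, \tfrac{2}{3}]. \end{cases}$$

(a) Sketch the graph of $f$.
(b) Consider the curve $\gamma : [0, 1] \to \mathbb{R}^2$ of components

$$x(t) = \sum_{n=1}^{\infty} \frac{f(3^{2n-2} t)}{2^n}, \qquad y(t) = \sum_{n=1}^{\infty} \frac{f(3^{2n-1} t)}{2^n},$$

and verify its continuity.
(c) Motivate why every point $(x_0, y_0) \in [0, 1] \times [0, 1]$ admits the representation

$$x_0 = \sum_{n=0}^{\infty} \frac{a_{2n}}{2^{n+1}}, \qquad y_0 = \sum_{n=0}^{\infty} \frac{a_{2n+1}}{2^{n+1}},$$

for suitable $a_n \in \{0, 1\}$.

(d) Prove that $x(t_0) = x_0$ and $y(t_0) = y_0$ for $t_0 = \sum_{n=0}^{\infty} \Big( \frac{2a_{2n}}{3^{2n+1}} + \frac{2a_{2n+1}}{3^{2n+2}} \Big)$, by using the fact that $3^n t_0$ is an even number plus the sum of the series

$$\frac{2a_n}{3} + \frac{2a_{n+1}}{3^2} + \cdots.$$


7. Prove that $[0,1]$ and $[0,1] \times [0,1]$ are not homeomorphic, but $A$ and $A \times A$ are.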


6.8 Sequences and Series of Continuous Functions 


The problem of approximating a function by a specific class of functions (for example, 
polynomials or rational functions, trigonometric polynomials) that have a number of 
nice properties leads to several notions of convergence for sequences of functions. 

In what follows, we will consider functions and sequences of complex-valued 
functions defined on a metric space M. 


6.8.1 Definition The sequence $(f_n)_n$ is pointwise convergent to a function $f$ if $f_n(x) \to f(x)$ for all $x \in M$. We denote this convergence by

$$f_n \xrightarrow{\ s\ } f$$

and call $f$ the pointwise limit of the sequence $(f_n)_n$.

Fig. 6.4 Pointwise convergence does not preserve continuity


The pointwise convergence does not necessarily guarantee the permanence of the properties shared by the functions that form the sequence. For example, the sequence of continuous functions,

$$f_n(x) = x^n, \quad x \in [0, 1],$$

is pointwise convergent to the function

$$f(x) = \begin{cases} 0 & \text{if } x \in [0, 1) \\ 1 & \text{if } x = 1, \end{cases}$$

which is discontinuous at $x = 1$. See Fig. 6.4.

In the Notes and Remarks at the end of this chapter, we will discuss the problem 
of how discontinuous the pointwise limit of a sequence of continuous functions can 
be. 
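For the example $f_n(x) = x^n$ above, a short computation shows both what goes right at each point and what goes wrong globally. The Python sketch below is an illustration only; the names `first_rank` and `sup_distance`, the tolerance $0.01$, and the grid are assumptions made here. The first function finds the smallest index from which $x^n$ stays below a tolerance at a given point $x \in [0, 1)$; the second estimates $\sup_{x \in [0,1)} |x^n - 0|$ on a grid.

```python
def first_rank(x, eps):
    """Smallest n with x**n < eps, for 0 <= x < 1 (x**n is decreasing in n)."""
    n = 0
    while x ** n >= eps:
        n += 1
    return n

def sup_distance(n, samples=100001):
    """Grid estimate of the supremum over [0, 1) of |x**n - 0|."""
    return max((i / samples) ** n for i in range(samples))

for x in (0.5, 0.9, 0.99, 0.999):
    print(f"x={x}: x^n < 0.01 from n = {first_rank(x, 0.01)} on")

for n in (1, 10, 100, 1000):
    print(f"n={n}: estimated sup over [0,1) of x^n = {sup_distance(n):.4f}")
```

The required index grows without bound as $x$ approaches 1, and the estimated supremum stays close to 1 for every $n$ (the true supremum over $[0, 1)$ equals 1).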

The pointwise convergence $f_n \xrightarrow{\ s\ } f$ means that for every $x \in M$ and every $\varepsilon > 0$, there is a natural number $N$ such that $|f_n(x) - f(x)| < \varepsilon$ whenever $n \ge N$. The rank $N$ depends on $\varepsilon$, on the point $x$ (and also on the sequence under attention). The phenomenon described above is caused by the impossibility of choosing a common rank $N$ for all the points in the domain. We are, thus, led to consider the following stronger concept of convergence:


6.8.2 Definition The sequence $(f_n)_n$ is uniformly convergent to the function $f$ if for every $\varepsilon > 0$, there is a natural number $N \in \mathbb{N}$ such that for every $n \ge N$ and for every $x \in M$, we have

$$|f_n(x) - f(x)| < \varepsilon.$$

This will be denoted $f_n \xrightarrow{\ u\ } f$.


Clearly, uniform convergence implies pointwise convergence.

From a geometrical point of view, the uniform convergence of a sequence of real-valued functions $f_n$ to the function $f$ means that from an index $N$ onwards, the graph of $f_n$ lies in between the graphs of the functions $f - \varepsilon$ and $f + \varepsilon$. See Fig. 6.5.

Fig. 6.5 Uniform convergence


6.8.3 Theorem (K. Weierstrass) Let $(f_n)_n$ be a sequence of continuous functions that converges uniformly to a function $f$. Then, $f$ is continuous.


In other words, the uniform convergence preserves continuity.


Proof For simplicity, we restrict here to the case where $M$ is an interval. The general case needs only obvious changes.

Let $a$ be a point in the domain $M$ of $f$ and let $\varepsilon > 0$. Since $f_n \xrightarrow{\ u\ } f$, there is a natural number $N$ such that for every $n \ge N$ and every $x \in M$ we have

$$|f_n(x) - f(x)| < \frac{\varepsilon}{3}.$$

Next, we use the continuity of $f_N$ at $a$ to obtain a number $\delta > 0$ such that

$$x \in M,\ |x - a| < \delta \quad \text{implies} \quad |f_N(x) - f_N(a)| < \frac{\varepsilon}{3}.$$

Then, for every $x \in M$ with $|x - a| < \delta$, we get

$$|f(x) - f(a)| \le |f(x) - f_N(x)| + |f_N(x) - f_N(a)| + |f_N(a) - f(a)| < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon,$$

which ends the proof of the continuity of $f$ at $a$.


An important technique to prove that $f_n \xrightarrow{\ u\ } f$ is to show the existence of a sequence $(a_n)_n$ of nonnegative numbers such that:

(a) $a_n \to 0$;

(b) $|f_n(x) - f(x)| \le a_n$ for all points $x$ and all indices $n$.

The details are straightforward. 
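As a worked illustration of this technique (the example $f_n(x) = \sin(nx)/n$ is chosen here and is not taken from the text), the bound required in (b) can be read off directly, and the rank obtained does not depend on the point $x$:

```latex
\[
  f_n(x) = \frac{\sin(nx)}{n}, \quad x \in \mathbb{R}, \qquad f = 0, \qquad
  |f_n(x) - f(x)| = \frac{|\sin(nx)|}{n} \le \frac{1}{n} =: a_n \to 0 .
\]
% Given \varepsilon > 0, pick N with 1/N < \varepsilon. Then for all n \ge N and all x,
% |f_n(x) - f(x)| \le a_n \le 1/N < \varepsilon, with N independent of x.
```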

Next, we will consider series $\sum_{n \ge 0} f_n$ of complex-valued functions, defined on the metric space $M$. The partial sums are defined by

$$S_n = f_0 + f_1 + \cdots + f_n, \quad n \ge 0.$$


6.8.4 Definition The series $\sum_{n \ge 0} f_n$ is pointwise convergent if the sequence of its partial sums is pointwise convergent to a function $S$, called the sum of the series. This way, the sum is defined by the formula

$$S(x) = \sum_{n=0}^{\infty} f_n(x) \quad \text{for all } x.$$


The uniformly convergent series are defined in a similar way (by asking the uni- 
form convergence of the sequence of partial sums). Obviously, every uniformly con- 
vergent series is also pointwise convergent, the sum being the same. According to 
Theorem 6.8.3, the sum of a uniformly convergent series of continuous functions is 
a continuous function. 


6.8.5 Definition The series $\sum_{n \ge 0} f_n$ is absolutely convergent if the series $\sum_{n \ge 0} |f_n|$ is pointwise convergent, that is,

$$\sum_{n=0}^{\infty} |f_n(x)| < \infty \quad \text{for all } x.$$


By Theorem 4.1.6, an absolutely convergent series is pointwise convergent. 
In this chapter, we are mostly interested in absolutely and uniformly convergent 
series. The main criterion about this sort of convergence is as follows: 


6.8.6 Theorem (Weierstrass' M-Test) Let $\sum_{n \ge 0} f_n$ be a series of functions from $M$ to $\mathbb{C}$, for which there is a sequence $(a_n)_n$ of nonnegative numbers such that $|f_n(x)| \le a_n$ for all $n \in \mathbb{N}$ and all $x \in M$.

If the series $\sum_{n \ge 0} a_n$ is convergent, then the series $\sum_{n \ge 0} f_n$ is absolutely and uniformly convergent.


Proof The absolute convergence of the series $\sum_{n \ge 0} f_n$ follows from Theorem 4.1.5 (Cauchy's Criterion). Let $S$ be its sum. According to the hypothesis,

$$\Big| S(x) - \sum_{k=0}^{n} f_k(x) \Big| = \Big| \sum_{k=n+1}^{\infty} f_k(x) \Big| \le \sum_{k=n+1}^{\infty} |f_k(x)| \le \sum_{k=n+1}^{\infty} a_k \to 0$$

as $n \to \infty$, which implies the uniform convergence.
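The estimate in the proof can be checked numerically on a concrete series. The Python sketch below is an illustration under assumptions made here: the series $\sum_{k \ge 1} \cos(kx)/k^2$ with $a_k = 1/k^2$ is a chosen example (indexed from 1 rather than 0), the function names are hypothetical, and a long partial sum stands in for the sum $S$. The tail bound $\sum_{k > n} 1/k^2 \le 1/n$ follows by comparison with the telescoping series $\sum_{k > n} 1/(k(k-1)) = 1/n$.

```python
import math

def partial_sum(x, n):
    """S_n(x) = sum_{k=1}^{n} cos(k x) / k^2."""
    return sum(math.cos(k * x) / k ** 2 for k in range(1, n + 1))

def tail_bound(n):
    """Upper bound 1/n for the tail sum_{k>n} 1/k^2, valid for n >= 1."""
    return 1.0 / n

xs = [0.25 * i for i in range(26)]           # sample points in [0, 6.25]
n = 50
worst = max(abs(partial_sum(x, 2000) - partial_sum(x, n)) for x in xs)
print(f"largest observed |S_2000(x) - S_{n}(x)| = {worst:.5f}  <=  bound {tail_bound(n):.5f}")
```

The same bound holds at every sample point, which is the numerical face of uniform convergence.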


The property of uniform continuity makes possible the approximation of continuous functions by very simple functions such as the piecewise linear functions. A function $f : [a,b] \to \mathbb{R}$ is called piecewise linear if there exist points $a = x_0 < x_1 < \cdots < x_n = b$ such that the restriction of $f$ to each interval $[x_k, x_{k+1}]$ is an affine function. Every piecewise linear function is Lipschitz.


6.8.7 Theorem Every continuous function $f : [a,b] \to \mathbb{R}$ is the uniform limit of a sequence of piecewise linear functions.


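Proof Given an integer $n \ge 1$, the uniform continuity of $f$ (Theorem 6.5.5) allows us to choose another integer $m \ge 1$ so large that

$$x, y \in [a, b],\ |x - y| \le \frac{b - a}{m} \quad \text{implies} \quad |f(x) - f(y)| < \frac{1}{n}.$$

Then, the piecewise linear function $f_n$, whose graph is the polygonal line inscribed in the graph of $f$ and has as vertices the points

$$\Big( a + k\,\frac{b - a}{m},\ f\Big( a + k\,\frac{b - a}{m} \Big) \Big)$$

for $k = 0, \ldots, m$, verifies $\sup_{x \in [a,b]} |f(x) - f_n(x)| < 1/n$.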
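The construction in this proof is easy to carry out numerically. The Python sketch below is an illustration only; the names `polygonal` and `sup_error`, the test function $\sqrt{x}$ on $[0, 1]$, and the evaluation grid are assumptions made here, and the grid maximum only estimates the supremum.

```python
import math

def polygonal(f, a, b, m):
    """Piecewise linear interpolant of f at the nodes a + k (b - a) / m, k = 0, ..., m."""
    h = (b - a) / m
    nodes = [a + k * h for k in range(m + 1)]
    values = [f(x) for x in nodes]
    def g(x):
        k = min(int((x - a) / h), m - 1)    # index of a subinterval containing x
        t = (x - nodes[k]) / h
        return (1 - t) * values[k] + t * values[k + 1]
    return g

def sup_error(f, g, a, b, samples=10001):
    """Grid estimate of the supremum of |f - g| over [a, b]."""
    grid = (a + (b - a) * i / (samples - 1) for i in range(samples))
    return max(abs(f(x) - g(x)) for x in grid)

for m in (4, 16, 64, 256):
    err = sup_error(math.sqrt, polygonal(math.sqrt, 0.0, 1.0, m), 0.0, 1.0)
    print(f"m={m:4d}  estimated sup error = {err:.5f}")
```

The estimated error decreases as $m$ grows, although slowly near the origin, where $\sqrt{x}$ is not Lipschitz.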


In Chap. 8 we will present the Weierstrass Approximation Theorem, concerning the approximation of a continuous function $f : [a,b] \to \mathbb{R}$ by a uniformly convergent sequence of polynomial functions.


Exercises 


1. Consider the sequence of functions 


1 
fi) = TG owe’ xER(neN). 


(a) Sketch the graphs of $f_1$, $f_2$, $f_3$.
(b) Find the pointwise limit $f$ of the sequence $(f_n)_n$.
(c) Is the sequence above uniformly convergent? Give an argument.

2. Prove that the series $\sum_{n=1}^{\infty} \bigl( nxe^{-nx} - (n-1)xe^{-(n-1)x} \bigr)$ is not uniformly
convergent on the interval $(0, 1)$, though its sum is continuous on this interval.

3. (Dini's Lemma). Prove the following partial converse of Theorem 6.8.3: Let
$f_n : [a,b] \to \mathbb{R}$ be a decreasing sequence of continuous functions that is pointwise
convergent to a continuous function $f : [a,b] \to \mathbb{R}$. Then $f_n \to f$ uniformly.
[Hint: By the hypothesis, the sequence $g_n = f_n - f$ is decreasing and pointwise
convergent to 0. Given $\varepsilon > 0$, the sets $A_n = \{x : g_n(x) < \varepsilon\}$ constitute an open
covering of the compact interval $[a, b]$. This yields a rank $N$ such that $A_n = [a, b]$
for all $n \ge N$.]

4. A sequence of functions $(f_n)_n$ is said to be uniformly bounded on an interval $[a, b]$
if there is a number $M$ so that $|f_n(x)| \le M$ for every $n$ and every $x \in [a, b]$.
Show that a uniformly convergent sequence $(f_n)_n$ of continuous functions on
$[a, b]$ must be uniformly bounded. Show that the same statement would not be
true for pointwise convergence.

5. If $(f_n)_n$ and $(g_n)_n$ are two sequences of complex-valued bounded functions that
converge uniformly on a set $S$, prove that $(f_n g_n)_n$ converges uniformly on $S$.

6. Suppose that $(f_n)_n$ is a sequence of functions converging uniformly to zero on an
interval $[a, b]$. Show that $\lim_{n \to \infty} f_n(x_n) = 0$ for every convergent sequence $(x_n)_n$
of points in $[a, b]$. Give an example to show that this statement may be false if
$f_n \to 0$ merely pointwise.
7. Riemann's zeta function is defined by

\[ \zeta(z) = \sum_{n=1}^{\infty} \frac{1}{n^z} \quad \text{for } \operatorname{Re} z > 1. \]

(a) Prove that the above series is uniformly convergent on every band $1 + \varepsilon \le
\operatorname{Re} z < \infty$ with $\varepsilon > 0$, and deduce from here that the zeta function is continuous.
(b) Prove Euler's formula,

\[ \zeta(x) = \lim_{n \to \infty} \prod_{k=1}^{n} \frac{1}{1 - p_k^{-x}} \quad \text{for } x > 1, \]

where $p_1 = 2$, $p_2 = 3$, $p_3 = 5, \dots$ is the strictly increasing sequence of all
prime numbers.

(c) Prove that Riemann's zeta function has no zero in $(1, \infty)$.

[Hint: (b) Start noticing that $\zeta(x)(1 - 2^{-x}) = 1 + \frac{1}{3^x} + \frac{1}{5^x} + \cdots$.]
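Euler's formula in part (b) can be checked numerically before attempting a proof. The following is a minimal sketch for real $x > 1$; the sieve `primes_up_to` and the truncation parameters are illustrative assumptions, and both quantities are only approximations of $\zeta(x)$.

```python
def primes_up_to(n):
    """All primes <= n, by the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            # Cross out the multiples of p, starting at p*p.
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [i for i, flag in enumerate(sieve) if flag]

def zeta_partial(x, terms):
    """Partial sum 1 + 1/2^x + ... + 1/terms^x of the zeta series."""
    return sum(1.0 / n ** x for n in range(1, terms + 1))

def euler_product(x, prime_bound):
    """Finite product of 1 / (1 - p^(-x)) over the primes p <= prime_bound."""
    product = 1.0
    for p in primes_up_to(prime_bound):
        product *= 1.0 / (1.0 - p ** (-x))
    return product

if __name__ == "__main__":
    print(zeta_partial(2.0, 100000), euler_product(2.0, 10000))
```

For $x = 2$ the two printed values agree to several decimal places; the agreement degrades as $x$ approaches 1, where both the series and the product converge slowly.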


Remark The zeta function was first studied by Euler (1737). See Sect. 8.7. Nevertheless,
this function bears the name of Riemann as a recognition of his deep
results relating the distribution of prime numbers and the properties of the extension
of $\zeta$ to the complex domain $\mathbb{C} \setminus \{1\}$.


6.9 The Banach Spaces $C_b(M)$ and $C(K)$


Let $K$ be a compact metric space (in particular, $K$ may be any compact interval or any
compact disc). The set $C(K)$, of all complex-valued continuous functions on $K$, is a
linear space with respect to the pointwise operations of addition and multiplication
by complex scalars. Since every continuous function on a compact space is bounded,
this space can be endowed with the sup-norm (or the norm of index infinity),

\[ \|f\|_\infty = \sup_{x \in K} |f(x)|. \]

Since the convergence associated to this norm is the uniform convergence (easy
to check), we also call $\|\cdot\|_\infty$ the norm of uniform convergence.

$C(K)$ is a Banach space with respect to the norm $\|\cdot\|_\infty$, referred to as the Banach
space of continuous functions on $K$.

6.9.1 Completeness Theorem for C(K) If $K$ is a compact metric space, then every
Cauchy sequence in $C(K)$ is convergent.


Proof Let $(f_n)_n$ be a Cauchy sequence in $C(K)$. Then, for every $\varepsilon > 0$, there is a
natural number $N$ such that for all $m, n \ge N$, we have

\[ \sup_{x \in K} |f_m(x) - f_n(x)| < \varepsilon. \]

In particular, for $x \in K$ arbitrarily fixed,

\[ |f_m(x) - f_n(x)| < \varepsilon \qquad (6.3) \]

whenever $m, n \ge N$. This shows that $(f_n(x))_n$ is a Cauchy sequence in $\mathbb{C}$ for each
$x$. Since $\mathbb{C}$ is complete, the sequence $(f_n(x))_n$ must be convergent, say to $f(x)$. This
gives us the pointwise limit $f$ of $(f_n)_n$. Letting $m \to \infty$ in (6.3), we obtain that

\[ |f(x) - f_n(x)| \le \varepsilon \]

for all $n \ge N$. Since $x$ was arbitrarily fixed, we get

\[ \sup_{x \in K} |f(x) - f_n(x)| \le \varepsilon, \qquad (6.4) \]

hence $(f_n)_n$ is uniformly convergent to $f$. By Theorem 6.8.3, $f$ must be continuous.
Therefore, $f \in C(K)$, and (6.4) shows that $f_n \to f$ in $C(K)$.


$C(K, \mathbb{R})$, the space of all continuous functions $f : K \to \mathbb{R}$, is a real Banach
space when endowed with the sup-norm.

If $M$ is a metric space, then a similar argument allows us to consider the Banach
space $C_b(M)$, of all continuous and bounded functions $f : M \to \mathbb{C}$, endowed with
the pointwise algebraic operations and the sup-norm. The proof of the completeness
of $C_b(M)$ is similar to that of Theorem 6.9.1. The real version, $C_b(M, \mathbb{R})$, of this
space is also a Banach space.

According to Remark 1.2.9 and the comment after Lemma 6.1.6, the space
$C_b(M, \mathbb{R})$ is an example of a linear lattice of functions.

As a matter of fact, we already encountered these spaces. 

If $M$ is the discrete metric space $\{1, \dots, p\}$, then the Banach space $C(M, \mathbb{R}) =
C_b(M, \mathbb{R})$ coincides with the $p$-dimensional real space $\mathbb{R}^p$, when endowed with the
sup-norm $\|\cdot\|_\infty$.

If $M$ is the discrete metric space $\mathbb{N}$, then the Banach space $C_b(M)$ coincides with
the Banach space $\ell^\infty(\mathbb{N})$ of all bounded sequences of complex numbers.

Consider now the one-point compactification $\overline{\mathbb{N}} = \mathbb{N} \cup \{\infty\}$ of the metric space
$\mathbb{N}$, endowed with the metric induced by $\overline{\mathbb{R}}$. According to Exercise 4, Sect. 5.5, $\overline{\mathbb{N}}$ is
a compact metric space that contains $\mathbb{N}$ as a dense subset. The elements of $C(\overline{\mathbb{N}})$ are
precisely the convergent sequences $x = x(n)$ of complex numbers, extended at $\infty$
via the formula

\[ x(\infty) = \lim_{n \to \infty} x(n). \]


6.9.2 Remark (The embedding of metric spaces) The consideration of the spaces
$C_b(M, \mathbb{R})$ leads us to the interesting conclusion that any metric space $M$ is isometric
to a subset of a suitable Banach space. Indeed, for an arbitrarily fixed element
$a$ of $M$, we may consider the function

\[ T : M \to C_b(M, \mathbb{R}) \]

which associates to each $x \in M$ the function $T(x) \in C_b(M, \mathbb{R})$ defined by
$T(x)(z) = d(x, z) - d(z, a)$. The boundedness of $T(x)$ follows from the inequality
$|d(x, z) - d(z, a)| \le d(x, a)$. The same argument shows that

\[ \|T(x) - T(y)\|_\infty = d(x, y) \quad \text{for all } x, y \in M, \]

that is, $T$ is an isometry.
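On a finite metric space the embedding of Remark 6.9.2 can be verified by direct computation. The following is a minimal sketch, assuming a handful of points of the plane with the taxicab metric (an illustrative choice); `embed`, `sup_distance` and `taxicab` are hypothetical helper names, and integer coordinates keep the arithmetic exact.

```python
import itertools

def taxicab(p, q):
    """Taxicab distance between two points of the plane."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def embed(points, dist, base):
    """Map each x to the tuple of values T(x)(z) = d(x, z) - d(z, base), z in points."""
    return {x: tuple(dist(x, z) - dist(z, base) for z in points) for x in points}

def sup_distance(u, v):
    """Sup-norm of the difference of two functions given by their value tuples."""
    return max(abs(s - t) for s, t in zip(u, v))

points = [(0, 0), (1, 2), (3, 1), (-2, 4), (5, 5)]
T = embed(points, taxicab, base=points[0])

if __name__ == "__main__":
    for x, y in itertools.combinations(points, 2):
        # The two printed numbers coincide: T preserves distances.
        print(x, y, sup_distance(T[x], T[y]), taxicab(x, y))
```

The equality holds because $|d(x,z) - d(y,z)| \le d(x,y)$ for every $z$, with equality at $z = y$; the base point $a$ cancels in the difference and only serves to make each $T(x)$ bounded.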


Exercises 


1. Prove that in C(K) (for K a compact metric space) the convergence in the sup- 
norm coincides with the uniform convergence. 

2. Suppose that $(f_n)_n$ is a sequence of Lipschitz functions in $C([0, 1])$ whose Lipschitz
constants do not exceed a number $L$. Prove that if $(f_n)_n$ converges pointwise
to a function $f$, then this convergence must be uniform.

3. Prove that the Banach space C([0, 1]) is separable. 


6.10 The Fundamental Theorem of Algebra 


The complex field $\mathbb{C}$ has the remarkable property of being algebraically closed, that
is, every algebraic equation of degree $n \ge 1$ with complex coefficients has at least one
root in $\mathbb{C}$. The proof of this result is based on the following lemma, which describes
the behavior at infinity of a polynomial function.


6.10.1 Lemma Let $P(z)$ be a polynomial function with complex coefficients, of
degree $n \ge 1$. Then, for every $A > 0$, there exists a positive number $R$ such that

\[ |z| \ge R \quad \text{implies} \quad |P(z)| \ge A. \]


Proof Suppose that $P(z) = a_0 z^n + a_1 z^{n-1} + \cdots + a_n$ with $a_0 \ne 0$. Then, for $z \ne 0$,

\[ P(z) = a_0 z^n \Bigl( 1 + \frac{a_1}{a_0 z} + \cdots + \frac{a_n}{a_0 z^n} \Bigr) \]

and the parenthesis has limit 1 as $|z| \to \infty$. This yields a number $R_1 > 0$ such that

\[ \Bigl| 1 + \frac{a_1}{a_0 z} + \cdots + \frac{a_n}{a_0 z^n} \Bigr| \ge \frac{1}{2} \quad \text{for all } z \in \mathbb{C} \text{ with } |z| \ge R_1. \]

Consequently, for $|z| \ge R \ge R_1$, we have

\[ |P(z)| \ge \frac{1}{2} |a_0 z^n| \ge \frac{1}{2} |a_0| R^n \]

and the assertion of our lemma follows for $R \ge (2A/|a_0|)^{1/n}$.


The following result shows us that every point at which | P(z)| attains a minimum 
is necessarily a root of the equation P(z) = 0: 


6.10.2 Fundamental Theorem of Algebra (d'Alembert-Gauss Theorem) Every
polynomial function $P(z) = a_0 z^n + a_1 z^{n-1} + \cdots + a_n$, of degree $n \ge 1$, with
complex coefficients, has at least one root in $\mathbb{C}$.


Proof Suppose that $P(z)$ has no roots. According to Lemma 6.10.1, there exists
$R > 0$ such that $|P(z)| \ge 1 + |P(0)|$ for every $z \in \mathbb{C}$ with $|z| \ge R$. Therefore,

\[ A = \inf_{z \in \mathbb{C}} |P(z)| = \inf_{|z| \le R} |P(z)|. \]

By Theorem 6.6.3, $|P(z)|$ attains its infimum on the compact disc $\overline{D}_R(0)$ at a point
$z_0$, which proves that $A > 0$. Replacing $P(z)$ by $\lambda P(z + z_0)$, for a suitable $\lambda \in \mathbb{C} \setminus \{0\}$,
we may assume that $z_0 = 0$ and $P(0) = 1$. In this case, $P(z)$ is of the form

\[ P(z) = b_0 z^n + \cdots + b_{n-k} z^k + 1 \]

where $b_0 \ne 0$ and $b_{n-k} \ne 0$. Necessarily, $k < n$, since otherwise $P(z) = b_0 z^n + 1$
and any solution $\alpha$ of the equation $z^n = -1/b_0$ would be a root of $P(z)$. Therefore,

\[ P(z) = z^{k+1} Q(z) + b_{n-k} z^k + 1 \]

for a suitable polynomial $Q(z)$ (not identically zero). Because $Q(z)$ is continuous,
for any solution $w$ of the equation $z^k = -1/b_{n-k}$, there exists $t$ with $0 < t < 1$ such
that $t\,|w^{k+1} Q(tw)| < 1$. Now

\[ P(tw) = (tw)^{k+1} Q(tw) + b_{n-k} (tw)^k + 1 = (tw)^{k+1} Q(tw) + 1 - t^k, \]

whence

\[ |P(tw)| \le t^{k+1} |w^{k+1} Q(tw)| + 1 - t^k < t^k + 1 - t^k = 1. \]

This contradicts the fact that the infimum of $|P(z)|$ is 1. Therefore, our assumption
that $P(z)$ has no root is false.

The remainder theorem for polynomials asserts that given two polynomials $P(z)$
and $Q(z)$ ($Q(z)$ of degree at least 1), there exist two other polynomials
$S(z)$ and $R(z)$, uniquely determined, such that

\[ P(z) = Q(z) \cdot S(z) + R(z) \]

and the degree of $R(z)$ is less than the degree of $Q(z)$.

In the case where $Q(z) = z - z_1$, the degree of the remainder $R(z)$ is 0, and
thus it is a constant, more precisely, $R(z) = P(z_1)$. In this way, if $z_1$ is a root of the
polynomial $P(z)$, then $P(z)$ admits the factorization

\[ P(z) = (z - z_1) \cdot S(z). \]

Continuing this procedure with $S(z)$, we eventually arrive at the factorization

\[ P(z) = a_0 (z - z_1) \cdots (z - z_n), \]

where $a_0$ is a nonzero constant (the coefficient of the monomial of greatest degree
of $P(z)$), and $z_1, \dots, z_n$ are the roots of $P(z)$. Counting only distinct roots, the
polynomial $P(z)$ can be represented as

\[ P(z) = a_0 (z - z_1)^{r_1} \cdots (z - z_p)^{r_p}, \]

where the integers $r_k > 0$ (the multiplicities of the roots $z_k$) verify the
equality

\[ r_1 + \cdots + r_p = \text{degree of } P(z). \]

We call the roots of multiplicity 1 simple roots. In the case of polynomials with real
coefficients, the multiplicity of a real root can be easily computed via the differential
calculus.
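Division by a linear factor $z - z_1$ can be carried out by Horner's scheme, which produces the quotient $S(z)$ and the remainder $P(z_1)$ in one pass. The following is a minimal sketch of that standard technique (the text does not prescribe an algorithm); `horner_divide` and `multiplicity` are hypothetical helper names, and integer coefficients are assumed so that the zero test is exact.

```python
def horner_divide(coeffs, z1):
    """Divide P by (z - z1); coeffs lists a_0, ..., a_n, highest degree first.

    Returns (quotient coefficients, remainder); the remainder equals P(z1).
    """
    quotient = []
    acc = 0
    for c in coeffs:
        acc = acc * z1 + c
        quotient.append(acc)
    remainder = quotient.pop()
    return quotient, remainder

def multiplicity(coeffs, z1):
    """Number of times (z - z1) divides P, found by repeated exact division."""
    count = 0
    while len(coeffs) > 1:
        quotient, remainder = horner_divide(coeffs, z1)
        if remainder != 0:
            break
        coeffs, count = quotient, count + 1
    return count

# P(z) = (z - 1)^2 (z + 2) = z^3 - 3z + 2
P = [1, 0, -3, 2]

if __name__ == "__main__":
    print(horner_divide(P, 1), multiplicity(P, 1), multiplicity(P, -2))
```

With floating point coefficients the test `remainder != 0` would have to be replaced by a tolerance, which is why the sketch sticks to exact arithmetic.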


6.10.3 The decomposition of a rational function into simple ratios Let $P(z)$ and
$Q(z)$ be two polynomials with complex coefficients and suppose that $Q(z)$ can be
represented as the product of two nonconstant polynomials $Q_1(z)$ and $Q_2(z)$, which
have no common root. Then, the greatest common divisor of $Q_1(z)$ and $Q_2(z)$ is 1.
By the division algorithm for polynomials, there exist two polynomials $S_1(z)$ and
$S_2(z)$ such that

\[ Q_1(z) S_1(z) + Q_2(z) S_2(z) = 1. \]

Consequently, one has the decomposition

\[ \frac{P(z)}{Q(z)} = \frac{P(z) S_2(z)}{Q_1(z)} + \frac{P(z) S_1(z)}{Q_2(z)}. \]

By iterating the above argument, we get

\[ \frac{P(z)}{Q(z)} = \frac{P(z)}{a_0 (z - z_1)^{r_1} \cdots (z - z_p)^{r_p}} = \frac{P_1(z)}{(z - z_1)^{r_1}} + \cdots + \frac{P_p(z)}{(z - z_p)^{r_p}}, \]

while expanding the polynomials $P_k(z)$ in powers of $z - z_k$, one arrives at the following
decomposition of a rational function in a polynomial part and simple ratios,

\[ \frac{P(z)}{Q(z)} = S(z) + \sum_{k=1}^{p} \sum_{m=1}^{r_k} \frac{A_{km}}{(z - z_k)^m}; \]

here, $S(z)$ is a polynomial and $A_{km}$ are complex numbers. Moreover, this decomposition
is unique.

When dealing with quotients of real polynomials, we have to notice that if $Q(x)$
is a polynomial with real coefficients which has a complex root $z_1 = x_1 + i y_1$, then
it has also the conjugate root, $\bar{z}_1 = x_1 - i y_1$. By dividing $Q(x)$ by

\[ (x - z_1)(x - \bar{z}_1) = (x - x_1)^2 + y_1^2, \]

we get a polynomial with real coefficients, so by repeating the argument, we conclude
that $Q(x)$ admits the following factorization into factors with real coefficients,

\[ Q(x) = a_0 [(x - x_1)^2 + y_1^2]^{r_1} \cdots [(x - x_q)^2 + y_q^2]^{r_q} (x - x_{q+1})^{r_{q+1}} \cdots (x - x_p)^{r_p}, \]

where $x_1 \pm i y_1, \dots, x_q \pm i y_q$ are its distinct roots in $\mathbb{C} \setminus \mathbb{R}$, and $x_{q+1}, \dots, x_p$ are its
distinct real roots.

Moreover, for every two polynomials $P(x)$ and $Q(x)$, with real coefficients, the
following decomposition holds:

\[ \frac{P(x)}{Q(x)} = S(x) + \sum_{k=1}^{q} \sum_{m=1}^{r_k} \frac{A_{km} x + B_{km}}{[(x - x_k)^2 + y_k^2]^m} + \sum_{k=q+1}^{p} \sum_{m=1}^{r_k} \frac{A_{km}}{(x - x_k)^m}. \]

Here $S(x)$ is a polynomial with real coefficients, and $A_{km}$ and $B_{km}$ are real numbers.
Exercises 


1. Prove that every polynomial function of odd degree, with real coefficients, has 
at least one real root. 

2. Let $P(x) = x^n - a_1 x^{n-1} - \cdots - a_n$ be a polynomial function such that the
numbers $a_k$ are all nonnegative and at least one of them is nonzero. Prove that
$P(x)$ has a unique positive root $p$, that this root is simple, and that the absolute values of
all other roots do not exceed $p$.

3. Find all polynomial functions $P(x)$ with real coefficients such that $P(0) = 0$
and $P(f(x)) = f(P(x))$ for a certain function $f : \mathbb{R} \to \mathbb{R}$ with $f(x) > x$ for
all $x \in \mathbb{R}$.

[Hint: Show that $P(x) - x$ has infinitely many roots, so it is constantly zero.]


6.11 Notes and Remarks 


Mathematical analysis outlines not only the different structured sets, but also the 
connections that can be established between them by functions that conserve these 
structures. Thus, an important object of study in the topological theory of metric 
spaces is the continuity of functions relating the different metric spaces. 

The surjective isometries allow us to identify the metric spaces from the metric 
point of view, while the homeomorphisms allow such an identification only from the 
topological point of view. 

The continuous functions $f : \mathbb{R} \to \mathbb{R}$ are characterized by the property that the
inverse images under $f$ of open sets are open. This property can be used to introduce
the notion of continuous function in the general framework of topological spaces. If $S$
and $T$ are topological spaces, a function $f : S \to T$ is continuous if for every open
subset $U$ of $T$, the set $f^{-1}(U)$ is open in $S$.

Does there exist a family $\mathcal{F}$ of sets of real numbers such that a function $f : \mathbb{R} \to \mathbb{R}$
is continuous if and only if it maps every set $X \in \mathcal{F}$ into a set $f(X) \in \mathcal{F}$? The answer
is "no". For details, see Velleman [4].

Heine's characterization of continuity has a companion in terms of slowly oscillating
sequences. A sequence $(x_n)_n$ of points in $\mathbb{R}$ is called slowly oscillating if
$x_n - x_m \to 0$ when $n/m \to 1$, $n > m \to \infty$. Clearly, every Cauchy sequence is
slowly oscillating. The sequence


n+l 
x =1, Xone = > fk =0,1,...,2"41- 1, neN,n>1 
j=l 


is slowly oscillating but not Cauchy. One can prove easily that every function
$f : \mathbb{R} \to \mathbb{R}$ which maps slowly oscillating sequences into slowly oscillating sequences
is continuous. Moreover, this mapping property is characteristic for the uniformly
continuous functions. See the paper of Vallin [5].

The lower semicontinuous functions $f : E \to \mathbb{R} \cup \{\infty\}$ appear as "usual" continuous
functions provided that $E$ is endowed with the topology associated to the
metric of $E$ and $\mathbb{R} \cup \{\infty\}$ is endowed with the upper topology, generated by the sets
$(a, \infty]$ with $a \in \mathbb{R}$.

The intermediate value property appears to be quite different from continuity.
Indeed, there are everywhere discontinuous functions $f : [0, 1] \to [0, 1]$ that attain
every value in $[0, 1]$ in every nondegenerate subinterval of $[0, 1]$. The following
example was indicated by Lebesgue [6], p. 97. If $x \in [0, 1)$ has the decimal expansion

\[ x = 0.d_1 d_2 d_3 \ldots \]

and the sequence $d_1, d_3, d_5, \ldots$ of decimals of odd index is periodic from some index
onwards, we put

\[ f(x) = 0.d_{2n} d_{2n+2} d_{2n+4} \ldots \]

provided that $2n - 1$ is the least index such that $d_{2n-1}, d_{2n+1}, d_{2n+3}, \ldots$ is periodic.
Otherwise, we put

\[ f(x) = 0. \]

Then, $f$ maps every open subinterval $(\alpha, \beta) \subset [0, 1]$ onto $[0, 1]$.

One can construct examples as above with the additional feature that f = 
outside a thin subset (more precisely, outside a Lebesgue null set in the sense of 
Definition 9.6.1). For details, see Problem 10861, Amer. Math. Monthly, 109 (2002), 
p. 923. 

Neuser and Wayment [7] gave an example which shows that the sum of a continuous
function $f : [0, 1] \to \mathbb{R}$ with a function $g : [0, 1] \to \mathbb{R}$ with the intermediate
value property need not have the intermediate value property.

On the other hand, Sierpinski [8] has shown that every function f (from R into 
itself) is the sum of two functions having the intermediate value property. 

The distinction between pointwise convergence and uniform convergence was 
first recognized by Karl Theodor Wilhelm Weierstrass. 

The convergence speed of sequences of continuous functions may be very small.
As was noticed by Benyamini [3], there is a uniformly bounded sequence of strictly
positive continuous functions $(f_n)_n$ on $[0, 1]$ with the following two properties:

(a) $f_n(x) \to 0$ for every $x \in [0, 1]$; and

(b) for each unbounded sequence $(\lambda_n)_n$ of positive numbers, there is a point
$x \in [0, 1]$ at which $\limsup_{n \to \infty} \lambda_n f_n(x) = \infty$.


The functions which are pointwise limits of continuous functions are said to be
of Baire class one. As the following result shows, the points of continuity of every
function $f : \mathbb{R} \to \mathbb{R}$ of Baire class one form a residual set.


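6.11.1 Theorem (René-Louis Baire [9]) Let $T$ and $M$ be two metric spaces. Suppose
that $f$ and $f_0, f_1, f_2, \ldots$ are functions from $T$ to $M$ such that each $f_n$ is continuous
and $\lim_{n \to \infty} f_n(t) = f(t)$ for each $t \in T$. Let

\[ A = \bigcap_{k=1}^{\infty} \bigcup_{m=1}^{\infty} \operatorname{int} \bigcap_{n=m}^{\infty} \Bigl\{ t \in T : d(f_n(t), f_m(t)) \le \frac{1}{k} \Bigr\}. \]
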
Then:


(a) $A$ is a $G_\delta$ set and $f$ is continuous at each point of $A$;

(b) $T \setminus A$ is a set of first Baire category in $T$;

(c) if $T$ is of the second Baire category, then the set of points of continuity of $f$ is
dense in $T$;

(d) if $V$ is an open set in $M$, we have

\[ f^{-1}(V) = \bigcup_{k=1}^{\infty} \bigcup_{m=1}^{\infty} \bigcap_{n=m}^{\infty} \Bigl\{ t \in T : d(f_n(t), M \setminus V) \ge \frac{1}{k} \Bigr\} \]

and thus $f^{-1}(V)$ is an $F_\sigma$ set.


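The details can be filled out easily. For (b), notice first that for every sequence
$(A_n)_n$ of subsets of $T$, we have

\[ \bigcup_{n=1}^{\infty} A_n \setminus \bigcup_{n=1}^{\infty} \operatorname{int} A_n \subset \bigcup_{n=1}^{\infty} (A_n \setminus \operatorname{int} A_n). \]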


The characteristic function of any closed and bounded set $A \subset \mathbb{R}$ is a Baire class
one function. In fact,

\[ \chi_A(x) = \lim_{n \to \infty} \frac{1}{1 + n\,d(x, A)} \quad \text{for all } x \in \mathbb{R}. \]


It is easy to see that every real-valued function $f$ defined on an interval $I$ that
has finitely many discontinuities is a Baire class one function. In fact, if $a_1, \dots, a_n$
are the discontinuities of $f$, and $N \in \mathbb{N}$ is large enough, we may replace $f$ on each
interval $[a_k - 1/N, a_k + 1/N] \cap I$ by linear interpolation and the functions $f_N$ so
obtained are continuous. Clearly, $f_N \to f$ pointwise.

As an application, if $x = 0.d_1(x) d_2(x) d_3(x) \ldots$ denotes the decimal expansion
of $x \in [0, 1]$, then $d_1(x), d_2(x), d_3(x), \ldots$ are Baire class one functions. Indeed, the
function $d_n$ is constant on each interval $\bigl( \frac{k}{10^n}, \frac{k+1}{10^n} \bigr)$, for $k = 0, \dots, 10^n - 1$ and thus,
it has finitely many discontinuities.

A topological characterization of Baire class one functions on R is as follows: 


6.11.2 Theorem (René-Louis Baire [9]) A function $f : \mathbb{R} \to \mathbb{R}$ is of Baire class
one if and only if the set of discontinuities of any restriction of $f$ to a closed subset is
a set of first Baire category.


See Gordon [10], p. 78, for details. 

Every real-valued semicontinuous function defined on an interval is of Baire class
one. The signum function is neither upper semicontinuous nor lower semicontinuous,
but it is of Baire class one (being connected to the pointwise limit of the sequence
of continuous functions $f_n(x) = \arctan(nx)$).

An example of a function which is not of Baire class one is $\chi_{\mathbb{Q}}$. This function is
a pointwise limit of Baire class one functions:

\[ \chi_{\mathbb{Q}}(x) = \lim_{m \to \infty} \Bigl\{ \lim_{n \to \infty} [\cos(m!\,\pi x)]^{2n} \Bigr\} \quad \text{for all } x \in \mathbb{R}. \]

The pointwise limits of Baire class one functions are called Baire class two functions.
In general, the Baire class $n$ functions are all functions which are the pointwise
limit of a sequence of functions of Baire class less than $n$. One can show that each
Baire class is strictly larger than the preceding one. See the classical book of Isidor
Natanson [11], vol. 2, Chap. 15.

The Fundamental Theorem of Algebra is still an object of reflection and journals 
like Amer. Math. Monthly continue to publish new proofs and generalizations. See 
also the monograph of Fine and Rosenberger [12]. 

Sometimes, this theorem leads to unexpected elegant results like the nonexistence
of a decent multiplication in $\mathbb{R}^3$. We know of the existence of a field structure on
$\mathbb{R}$, $\mathbb{R}^2$ and $\mathbb{R}^4$ (the latter case being a celebrated construction due to William Rowan
Hamilton), but on the other powers of $\mathbb{R}$, we cannot even complete the linear structure
to a structure of associative algebra such that the equations $ax = b$ admit unique
solutions for every $a \ne 0$ and every $b$. The solvability of these equations translates
into the fact that the left multiplication by $a \ne 0$,

\[ L_a : \mathbb{R}^n \to \mathbb{R}^n, \quad L_a(x) = ax, \]

is a linear isomorphism, so that $\det L_a \ne 0$.

But, for $n = 3$ and $a = (1, t, 0)$ (with $t$ a real parameter), this means that a third
degree polynomial with real coefficients, $\det L_a$, has no real root, a contradiction.
The argument also works for all odd powers $n > 1$.

It is worth noticing that the Fundamental Theorem of Algebra is equivalent to the
existence of eigenvalues of any $n \times n$ matrix with complex coefficients.
Indeed, any polynomial $P(z) = z^n + a_1 z^{n-1} + \cdots + a_n$ is the characteristic polynomial
of a matrix, more precisely, of the Frobenius matrix

\[ \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ -a_n & -a_{n-1} & -a_{n-2} & \cdots & -a_1 \end{pmatrix}. \]
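The claim about the Frobenius (companion) matrix can be checked with exact arithmetic. The following is a minimal sketch, assuming the convention with ones on the superdiagonal and last row $(-a_n, \dots, -a_1)$ for $P(z) = z^n + a_1 z^{n-1} + \cdots + a_n$; the helper names are hypothetical, and $\det(zI - C)$ is compared with $P(z)$ at more than $n$ integer points, which suffices for two monic polynomials of degree $n$.

```python
from fractions import Fraction

def companion(a):
    """Companion matrix of z^n + a[0] z^(n-1) + ... + a[n-1]."""
    n = len(a)
    rows = [[1 if j == i + 1 else 0 for j in range(n)] for i in range(n - 1)]
    rows.append([-a[n - 1 - j] for j in range(n)])  # last row: -a_n, ..., -a_1
    return rows

def det(matrix):
    """Determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in matrix]
    n, sign, result = len(m), 1, Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            sign = -sign
        result *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for j in range(c, n):
                m[r][j] -= factor * m[c][j]
    return sign * result

def char_poly_at(matrix, z):
    """Value of det(z I - matrix)."""
    n = len(matrix)
    return det([[(z if i == j else 0) - matrix[i][j] for j in range(n)]
                for i in range(n)])

def poly_at(a, z):
    """Value of z^n + a[0] z^(n-1) + ... + a[n-1], by Horner's scheme."""
    value = 1
    for c in a:
        value = value * z + c
    return value

a = [2, -5, 3, 7]  # P(z) = z^4 + 2 z^3 - 5 z^2 + 3 z + 7, an arbitrary example
C = companion(a)

if __name__ == "__main__":
    print([(z, char_poly_at(C, z), poly_at(a, z)) for z in range(-3, 4)])
```

Other texts place the coefficients in the last column or the first row; those variants are transposes or permutations of this matrix and have the same characteristic polynomial.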

Chapter 7
Elementary Functions 


The purpose of this chapter is to give a self-contained presentation of the so called 
elementary functions, such as the exponential, the logarithm, the power function, 
the sine and the cosine functions, and so on. An important role in our approach is 
played by the functional equations and the power series. 


7.1 The Exponential 


The rational powers of positive numbers have been introduced in Sect. 1.3 as a consequence
of the Completeness Axiom. The reader should notice the similarity with the
argument given in Sect. 6.4, where the Intermediate Value Theorem took the central
role.

The possibility of defining all powers with real exponent is motivated by the following
theorem, based entirely on the Completeness Axiom and some of its direct consequences:


7.1.1 Theorem If $a$ is a real number, $a > 1$, then there exists a unique
function $f : \mathbb{R} \to \mathbb{R}$ such that:


(a) $f(x + y) = f(x) f(y)$ for all $x, y \in \mathbb{R}$;
(b) $f(1) = a$;
(c) The function $f$ is strictly increasing.


The function $f$, so defined, is known as the exponential function of basis $a$,
and denoted $\exp_a$ or $a^x$. In the particular case when $a = e$, it is simply called the
exponential function and denoted $\exp$ or $e^x$.


Proof We will show first that every function that verifies the conditions (a), (b), and
(c) is defined in a unique way.

Indeed, by (a) we get $f(0) = f(0)^2$, so that $f(0) \in \{0, 1\}$. If $f(0) = 0$, then
from (a) we infer that

\[ f(x) = f(x + 0) = f(x) f(0) = 0 \quad \text{for all } x \in \mathbb{R}, \]

a fact which contradicts (c). Therefore, $f(0) = 1$ and an easy induction argument
(based on (a) and (b)) shows that $f(p) = a^p$, for all $p \in \mathbb{N}$. By taking into account
that $1 = f(0) = f(x - x) = f(x) f(-x)$, for all $x \in \mathbb{R}$, we infer that

\[ f(-x) = f(x)^{-1} \quad \text{for all } x \in \mathbb{R}, \]

whence $f(p) = a^p$ for all $p \in \mathbb{Z}$.

According to the Archimedean Property, for each $x \in \mathbb{R}$ and each natural number
$n \ge 1$, there is a (unique) integer $m$ such that $m \le nx < m + 1$. Since the function
$f$ is strictly increasing, this leads to

\[ a^m = f(m) \le f(nx) = f(x)^n < f(m + 1) = a^{m+1}, \]

and thus

\[ a^{m/n} \le f(x) < a^{(m+1)/n}. \]

For $n_k = 2^k$, $k \ge 1$, denote by $m_k$ the corresponding value of $m$. Then

\[ 2m_k \le 2n_k x = n_{k+1} x < 2(m_k + 1), \]

from which it follows that $2m_k \le m_{k+1}$ and $m_{k+1} + 1 \le 2(m_k + 1)$. Therefore,

\[ \frac{m_k}{n_k} = \frac{2m_k}{2n_k} \le \frac{m_{k+1}}{n_{k+1}} \quad \text{and} \quad \frac{m_{k+1} + 1}{n_{k+1}} \le \frac{2(m_k + 1)}{2n_k} = \frac{m_k + 1}{n_k}. \]

Denoting

\[ I_k = \bigl[ a^{m_k/n_k},\ a^{(m_k+1)/n_k} \bigr], \]

it follows that $f(x)$ belongs to the intersection of a decreasing sequence of intervals,
whose lengths verify

\[ \ell(I_k) = a^{m_k/n_k} \bigl( a^{1/n_k} - 1 \bigr) \le a^{(m_1+1)/n_1} \bigl( a^{1/n_k} - 1 \bigr) \le \frac{a - 1}{n_k}\, a^{(m_1+1)/n_1}. \]

See Exercise 4 at the end of Sect. 1.3. According to the Nested Intervals Lemma,
$f(x)$ is the unique point in the intersection of all intervals $I_k$.
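The nested intervals of the proof can be computed using only square roots and integer powers. The following is a minimal sketch, assuming floating point arithmetic (so the brackets are subject to rounding) and the illustrative values $a = 2$, $x = \sqrt{2}$; `dyadic_bracket` is a hypothetical helper name.

```python
import math

def dyadic_bracket(a, x, k):
    """Return (a^(m/n), a^((m+1)/n)) with n = 2^k and m = floor(n x)."""
    n = 2 ** k
    m = math.floor(n * x)
    root = a
    for _ in range(k):      # a^(1/2^k), obtained by k successive square roots
        root = math.sqrt(root)
    lower = root ** m       # integer power of the root
    return lower, lower * root

if __name__ == "__main__":
    for k in (4, 8, 12, 20):
        lower, upper = dyadic_bracket(2.0, math.sqrt(2.0), k)
        print(k, lower, upper, upper - lower)
```

Each step halves the mesh $1/n_k$, and the printed widths shrink roughly like $(a - 1)/n_k$, in agreement with the length estimate used in the proof.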

To finish the proof, we need to check that the function $f$ determined as above
satisfies the conditions (a)-(c).

Let $x$ and $y$ be two arbitrary real numbers. For each $k \in \mathbb{N}$, let us consider integers
$m_k$ and $p_k$ such that

\[ m_k \le n_k x < m_k + 1 \quad \text{and} \quad p_k \le n_k y < p_k + 1, \]

where $n_k = 2^k$. Then $m_k + p_k \le n_k(x + y) < m_k + p_k + 2$ and

\[ a^{m_k/n_k} \le f(x) \le a^{(m_k+1)/n_k}, \qquad a^{p_k/n_k} \le f(y) \le a^{(p_k+1)/n_k}, \]

from which we infer that

\[ a^{(m_k+p_k)/n_k} \le f(x + y) \le a^{(m_k+p_k+2)/n_k} \]

and

\[ a^{(m_k+p_k)/n_k} \le f(x) f(y) \le a^{(m_k+p_k+2)/n_k}. \]

Therefore,

\[ |f(x + y) - f(x) f(y)| \le a^{(m_k+p_k)/n_k} \bigl( a^{2/n_k} - 1 \bigr) \le a^{(m_1+p_1+2)/n_1} \bigl( a^{2/n_k} - 1 \bigr), \]

and because the right-hand side approaches 0 as $k \to \infty$, this shows that the formula
(a) holds.

By our definition, $f(1) = a$ and $f(x) > 0$, for all $x \in \mathbb{R}$. Given $z > 0$, there is a
natural number $k$ such that $n_k = 2^k$ verifies $n_k z \ge 1$. Then, $m_k = \lfloor n_k z \rfloor \ge 1$, which
implies $f(z) \ge a^{m_k/n_k} \ge a^{1/n_k} > 1$. Therefore, for $y > x$ we get

\[ f(y) = f(y - x) f(x) > f(x), \]

that is, the function $f$ is strictly increasing.


The exponential function can also be defined for a base $a \in (0, 1)$ via the formula

\[ a^x = (1/a)^{-x} \quad \text{for all } x \in \mathbb{R}. \]

When $a = 1$, we put

\[ 1^x = 1 \quad \text{for all } x \in \mathbb{R}. \]

The graph of the function $a^x$ is sketched in Fig. 7.1.

Fig. 7.1 The graph of the function $a^x$ (the cases $0 < a < 1$ and $a > 1$)

The following algebraic properties of the exponential are immediate consequences
of Theorem 7.1.1:

$a^{x+y} = a^x \cdot a^y$;
$a^0 = 1$; $a^1 = a$;
If $x < y$ and $a > 1$, then $a^x < a^y$;
If $x < y$ and $0 < a < 1$, then $a^x > a^y$;
$a^{-x} = 1/a^x$.

The behavior at infinity of the exponential function is as follows:

\[ \lim_{x \to -\infty} a^x = \begin{cases} \infty & \text{if } 0 < a < 1 \\ 0 & \text{if } a > 1 \end{cases} \qquad \lim_{x \to \infty} a^x = \begin{cases} 0 & \text{if } 0 < a < 1 \\ \infty & \text{if } a > 1. \end{cases} \]

For a proof, notice first that

\[ a^n > n(a - 1) \quad \text{for all } a > 1 \text{ and all } n \in \mathbb{N}, \]

which can be easily checked using induction. Since the exponential function of base
$a > 1$ is increasing, this inequality implies the last of the four limits indicated above.
The other ones follow from it by algebraic operations.


7.1.2 Proposition The exponential function is continuous on $\mathbb{R}$.


Proof We will only prove the continuity at the origin. The general case is an immediate
consequence of the formula

\[ a^x - a^{x_0} = a^{x_0} \bigl( a^{x - x_0} - 1 \bigr). \]

As $a^{1/n} \to 1$, for each $\varepsilon > 0$, there is a natural number $N \ge 1$ such that
$|a^{1/n} - 1| < \varepsilon$, for all $n \ge N$. Assuming that $a > 1$, we have

\[ a^{-1/N} \le a^x \le a^{1/N} \quad \text{for } |x| \le 1/N, \]

so that

\[ |a^x - 1| \le \max \bigl\{ |a^{-1/N} - 1|,\ |a^{1/N} - 1| \bigr\} \le \max \bigl\{ \varepsilon a^{-1/N},\ \varepsilon \bigr\} \le \varepsilon \]

for all $|x| \le 1/N$, which proves the continuity of the exponential function of base $a$
at 0. The case when $0 < a < 1$ can be treated similarly.


Taking into account the behavior at infinity of the exponential and the Intermediate
Value Theorem, we obtain that the image of the function $\exp_a$ (for $a > 0$, $a \ne 1$) is
$(0, \infty)$.


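7.1.3 Proposition (The exponential function and the number e) Consider the function
$f(x) = \bigl( 1 + \frac{1}{x} \bigr)^x$, defined for $x \in (-\infty, -1) \cup (0, \infty)$. Then:


(a) $\lim_{x \to \infty} \bigl( 1 + \frac{1}{x} \bigr)^x = e$;

(b) $\lim_{x \to -\infty} \bigl( 1 + \frac{1}{x} \bigr)^x = e$.


Proof (a) For each real number $x \ge 1$, we have:

\[ \Bigl( 1 + \frac{1}{x} \Bigr)^x \le \Bigl( 1 + \frac{1}{\lfloor x \rfloor} \Bigr)^{\lfloor x \rfloor + 1} = \Bigl( 1 + \frac{1}{\lfloor x \rfloor} \Bigr)^{\lfloor x \rfloor} \Bigl( 1 + \frac{1}{\lfloor x \rfloor} \Bigr) \]

\[ \Bigl( 1 + \frac{1}{x} \Bigr)^x \ge \Bigl( 1 + \frac{1}{\lfloor x \rfloor + 1} \Bigr)^{\lfloor x \rfloor} = \Bigl( 1 + \frac{1}{\lfloor x \rfloor + 1} \Bigr)^{\lfloor x \rfloor + 1} \Bigl( 1 + \frac{1}{\lfloor x \rfloor + 1} \Bigr)^{-1}. \]

Clearly, $\lfloor x \rfloor \to \infty$ as $x \to \infty$, so the right-hand sides of both inequalities above
tend to $e$, and hence so does $\bigl( 1 + \frac{1}{x} \bigr)^x$.

(b) Substituting $y = -x$, we have to prove that $\lim_{y \to \infty} \bigl( 1 - \frac{1}{y} \bigr)^{-y} = e$. This
follows from the assertion (a) since

\[ \Bigl( 1 - \frac{1}{y} \Bigr)^{-y} = \Bigl( \frac{y}{y - 1} \Bigr)^{y} = \Bigl( 1 + \frac{1}{y - 1} \Bigr)^{y - 1} \Bigl( 1 + \frac{1}{y - 1} \Bigr) \to e \]

as $y \to \infty$.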


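An immediate consequence of Propositions 7.1.2 and 7.1.3 is the following formula
for the exponential function

\[ e^x = \lim_{n \to \infty} \Bigl( 1 + \frac{x}{n} \Bigr)^n. \qquad (7.1) \]

Indeed, for $x \ne 0$,

\[ \Bigl( 1 + \frac{x}{n} \Bigr)^n = \Bigl[ \Bigl( 1 + \frac{x}{n} \Bigr)^{n/x} \Bigr]^x \to e^x \quad \text{as } n \to \infty. \]

This formula agrees for $x = 1$ with the formula by which the number $e$ was defined in
Sect. 2.3. Drawing a parallel to the theory developed in Sect. 2.3, we can also define
the exponential function as the sum of a series of powers of $x$:

\[ e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!} \quad \text{for } x \in \mathbb{R}. \qquad (7.2) \]
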
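The absolute convergence of this series was already noticed in Remark 4.2.9. By
denoting $S(x) = \sum_{n=0}^{\infty} \frac{x^n}{n!}$, and taking into account Mertens' Theorem on the product
of absolutely convergent series, we get

\[ \begin{aligned}
S(x) \cdot S(y) &= \Bigl( \sum_{n=0}^{\infty} \frac{x^n}{n!} \Bigr) \Bigl( \sum_{n=0}^{\infty} \frac{y^n}{n!} \Bigr) \\
&= \Bigl( 1 + \frac{x}{1!} + \frac{x^2}{2!} + \cdots \Bigr) \Bigl( 1 + \frac{y}{1!} + \frac{y^2}{2!} + \cdots \Bigr) \\
&= 1 + \Bigl( \frac{x}{1!} + \frac{y}{1!} \Bigr) + \Bigl( \frac{x^2}{2!} + \frac{xy}{1!\,1!} + \frac{y^2}{2!} \Bigr) + \cdots \\
&\qquad + \Bigl( \frac{x^n}{n!} + \frac{x^{n-1} y}{(n-1)!\,1!} + \cdots + \frac{y^n}{n!} \Bigr) + \cdots \\
&= \sum_{n=0}^{\infty} \frac{(x + y)^n}{n!} = S(x + y),
\end{aligned} \]

for all $x, y \in \mathbb{R}$. Thus, $S(x)$ verifies the functional equation mentioned in
Theorem 7.1.1(a).
Moreover, $S(1) = e$ and

\[ 0 \le x < y \quad \text{implies} \quad 1 = S(0) \le S(x) < S(y). \]

Since $S(-x) = S(x)^{-1}$, we also have $S(x) < S(y)$ for $x < y \le 0$. From
Proposition 4.3.3 we infer that $S(x) < 1$ if $x \in (-1, 0)$. Therefore, $S(x)$ is strictly
increasing. Taking into account Theorem 7.1.1, we conclude that $S(x) = e^x$ for every
$x \in \mathbb{R}$.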
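The two descriptions of the exponential, the limit (7.1) and the series (7.2), can be compared numerically. The following is a minimal sketch, assuming double precision and a fixed number of terms; `exp_series` and `exp_limit` are hypothetical helper names, and `math.exp` serves only as a reference value.

```python
import math

def exp_series(x, terms=40):
    """Partial sum of the series (7.2): sum of x^n / n! for n < terms."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= x / (n + 1)   # turns x^n / n! into x^(n+1) / (n+1)!
    return total

def exp_limit(x, n):
    """The n-th term (1 + x/n)^n of the sequence in (7.1)."""
    return (1.0 + x / n) ** n

if __name__ == "__main__":
    for x in (-2.0, -0.5, 1.0, 3.0):
        print(x, exp_series(x), exp_limit(x, 10**6), math.exp(x))
    # Functional equation S(x) S(y) = S(x + y), up to rounding.
    print(exp_series(1.3) * exp_series(-0.4), exp_series(0.9))
```

The series converges far faster than the sequence in (7.1): forty terms already reach machine precision for moderate $x$, while $(1 + x/n)^n$ with $n = 10^6$ is accurate only to about six digits.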

An easy consequence of the series expansion of the exponential (7.2) is the inequality

\[ |e^x - 1 - x| \le x^2 (e - 2) \quad \text{for } |x| \le 1, \]

which yields the asymptotic formula

\[ e^x = 1 + x + o(x) \quad \text{as } x \to 0. \]
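The inequality $|e^x - 1 - x| \le x^2(e - 2)$ on $[-1, 1]$ can be tested on a grid. The following is a minimal sketch, assuming double precision; `remainder_gap` is a hypothetical helper name, and `math.expm1` is used because it evaluates $e^x - 1$ accurately for small $x$.

```python
import math

def remainder_gap(x):
    """x^2 (e - 2) - |e^x - 1 - x|; nonnegative on [-1, 1] if the inequality holds."""
    return x * x * (math.e - 2.0) - abs(math.expm1(x) - x)

def relative_remainder(x):
    """(e^x - 1 - x) / x, which should tend to 0 as x -> 0."""
    return (math.expm1(x) - x) / x

if __name__ == "__main__":
    print(min(remainder_gap(i / 1000.0) for i in range(-1000, 1001)))
    print([relative_remainder(10.0 ** (-k)) for k in range(1, 6)])
```

Equality is attained at $x = 1$, so the smallest gap is zero up to rounding; the second line of output illustrates the asymptotic formula, since the quotients decrease roughly like $x/2$.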


We end this section with a remark concerning the computation of the values
of the exponential function using a PC. In principle, a PC is able to perform such
computations with any precision, but without any idea of what we are computing,
we may be led to wrong conclusions. For example, $\exp(\pi\sqrt{163})$ is a number very
close to an integer and all PCs will give you

\[ 262\,537\,412\,640\,768\,744 \]

within an absolute error less than $10^{-12}$, so we may think of $\exp(\pi\sqrt{163})$ as an
integer. However, it was noticed by Ch. Hermite in 1859 that this number is irrational and
its exact value with 32 digits is

\[ \exp(\pi\sqrt{163}) = 262\,537\,412\,640\,768\,743.\underbrace{999\,999\,999\,999}_{12\ \text{times}}25\ldots. \]

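The near-integer can be reproduced with arbitrary precision arithmetic. The following is a minimal sketch using Python's `decimal` module, assuming 50 significant digits are enough; the digits of $\pi$ are hard-coded because the module provides no such constant.

```python
from decimal import Decimal, getcontext

getcontext().prec = 50  # work with 50 significant digits

# 50 significant digits of pi (hard-coded; the decimal module has no pi constant).
PI = Decimal("3.1415926535897932384626433832795028841971693993751")

value = (PI * Decimal(163).sqrt()).exp()
nearest = value.to_integral_value()   # nearest integer
gap = nearest - value                 # distance to that integer

if __name__ == "__main__":
    print(value)
    print(nearest, gap)
```

In ordinary double precision the same computation cannot even resolve the units digit, since consecutive floating point numbers of this size are 32 apart; this is the kind of wrong conclusion the remark warns about.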
Exercises 


1. Prove that $e^x \ge 1 + x$, for all $x \in \mathbb{R}$, with equality only for $x = 0$. Infer that
$e^x < 1/(1 - x)$, for all $x \in (0, 1)$.
[Hint: See the series expansion of the exponential (Formula (7.2)).]

2. Let $(a_n)_n$ and $(b_n)_n$ be two real sequences such that $a_n + b_n \to 0$ and
$e^{a_n} + e^{b_n} \to 2$. Prove that $(a_n)_n$ and $(b_n)_n$ converge to 0.

3. Let $q$ be a positive integer, $\omega = \cos\frac{2\pi}{q} + i \sin\frac{2\pi}{q}$ and let $r \in \{0, 1, \dots, q - 1\}$.
Prove that

\[ \sum_{n \equiv r \ (\mathrm{mod}\ q)} \frac{x^n}{n!} = \frac{1}{q} \sum_{k=0}^{q-1} \omega^{-kr} e^{\omega^k x} \]

for all $x \in \mathbb{R}$.

[Hint: Notice that $S = 1 + \omega^k + \cdots + \omega^{(q-1)k} = q$ when $k \equiv 0 \ (\mathrm{mod}\ q)$, and 0
otherwise.]


7.2 The Logarithm and the Power Function 


In the preceding section, we showed that the exponential function $\exp_a$ (of basis
$a > 0$, $a \ne 1$) establishes a bijection from $\mathbb{R}$ onto $(0, \infty)$. Its inverse,

\[ \log_a : (0, \infty) \to \mathbb{R}, \]

is called the logarithmic function of basis $a$.

The natural choice for the basis is $e$, in which case the logarithmic function is
denoted $\log$ or $\ln$.

Notice the formulas

\[ a^{\log_a x} = x, \quad \text{for all } x \in (0, \infty) \]

and

\[ \log_a a^x = x, \quad \text{for all } x \in \mathbb{R}. \]

Being the inverse of a monotone continuous function acting on intervals, the
logarithmic function is also monotone and continuous. See Theorem 6.4.6.

Fig. 7.2 The graph of $\log_a$


The algebraic properties of the logarithmic function are summarized by the formulas
below:

$\log_a xy = \log_a x + \log_a y$;
$\log_a 1 = 0$ and $\log_a a = 1$;
If $0 < a < 1$ and $0 < x < y$, then $\log_a x > \log_a y$;
If $a > 1$ and $0 < x < y$, then $\log_a x < \log_a y$;
$\log_a (1/x) = -\log_a x$;
$\log_{1/a} x = -\log_a x$.

They all are easy consequences of the similar properties of the function $\exp_a$.
For example, in order to prove the first formula, we may assume (since $\exp_a$ is a
bijection) that $x = a^{\xi}$ and $y = a^{\eta}$ for suitable $\xi, \eta \in \mathbb{R}$. Then $xy = a^{\xi} \cdot a^{\eta} = a^{\xi + \eta}$
and thus, $\log_a xy = \xi + \eta = \log_a x + \log_a y$.

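The behavior of the logarithmic function $\log_a$ at the endpoints of its domain of
definition can also be deduced from the behavior at infinity of the exponential:

\[ \lim_{x \to 0+} \log_a x = \begin{cases} \infty & \text{if } 0 < a < 1 \\ -\infty & \text{if } a > 1 \end{cases} \qquad \lim_{x \to \infty} \log_a x = \begin{cases} -\infty & \text{if } 0 < a < 1 \\ \infty & \text{if } a > 1. \end{cases} \]

The graph of the logarithmic function is shown in Fig. 7.2.
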
7.2.1 Lemma For every $a > 0$, $a \neq 1$, $b > 0$ and every $x \in \mathbb{R}$, we have
$$\log_a b^x = x \log_a b.$$

Proof By replacing $a$ by $1/a$ (if necessary), we may assume that $a > 1$. Then the function $f(x) = a^{x \log_a b}$ fulfills the conditions (a)-(c) in Theorem 7.1.1 (with $f(1) = b$). Consequently, $f(x) = b^x$, and thus $\log_a b^x = x \log_a b$.


7.2.2 Corollary (The change of basis) For every $a > 0$, $a \neq 1$, $b > 0$, $b \neq 1$ and $x > 0$, we have
$$\log_b x = \frac{\log_a x}{\log_a b}.$$

Proof Replace $x$ by $\log_b x$ in Lemma 7.2.1.
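
The change-of-basis formula is easy to check numerically. The Python sketch below is only an illustration under stated assumptions: the helper name `log_base` is hypothetical (it does not come from the text), it evaluates $\log_a x$ through the standard library's natural logarithm, and agreement is expected only up to floating-point rounding.

```python
import math

def log_base(a: float, x: float) -> float:
    """Logarithm of x in basis a (a > 0, a != 1, x > 0), via natural logarithms."""
    return math.log(x) / math.log(a)

a, b, x = 2.0, 10.0, 1000.0
lhs = log_base(b, x)                     # log_b x
rhs = log_base(a, x) / log_base(a, b)    # log_a x / log_a b
print(lhs, rhs)
```

Both printed values should agree with $\log_{10} 1000 = 3$ up to rounding.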


Since the logarithmic function is injective, an equality of the form

$$\log_a x = \log_a y$$

is possible only for $x = y$. This yields a very convenient procedure to check identities for the exponential, such as

$$(a^x)^y = a^{xy}.$$


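7.2.3 Theorem For every $a > 0$, $a \neq 1$, we have:

(a) $\lim_{x \to 0} \frac{\log_a (1 + x)}{x} = 1/\log a$;
(b) $\lim_{x \to 0} \frac{a^x - 1}{x} = \log a$.

Proof (a) Notice that $\frac{\log_a(1+x)}{x} = \log_a (1 + x)^{1/x}$ and use the continuity of the logarithmic function and the fact that $(1+x)^{1/x} \to e$ as $x \to 0$ (formula (a) of the relevant Proposition); the limit is $\log_a e = 1/\log a$.

(b) Put $y = a^x - 1$. Then $x = \log_a (1 + y)$ and $\frac{a^x - 1}{x} = \frac{y}{\log_a(1 + y)}$. Since $y \to 0$ as $x$ approaches 0, the result follows from part (a).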
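
As a numerical illustration of Theorem 7.2.3, the following Python sketch evaluates both difference quotients for small $x$. The function names `log_quotient` and `exp_quotient` are hypothetical, and the use of `math.log1p` and `math.expm1` (which reduce cancellation near 0) is an implementation choice of this sketch, not something taken from the text.

```python
import math

def log_quotient(a: float, x: float) -> float:
    """log_a(1 + x) / x for x != 0, x > -1."""
    return math.log1p(x) / math.log(a) / x

def exp_quotient(a: float, x: float) -> float:
    """(a**x - 1) / x for x != 0."""
    return math.expm1(x * math.log(a)) / x

a = 3.0
for x in (1e-2, 1e-4, 1e-6):
    # The two columns should approach 1/log(a) and log(a) respectively.
    print(x, log_quotient(a, x), exp_quotient(a, x))
print(1.0 / math.log(a), math.log(a))
```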


For $a = e$, the statement of Theorem 7.2.3 can be rewritten as

$$\log(1 + x) = x + x \cdot o(x)$$
$$e^x = 1 + x + x \cdot o(x)$$

for $x \to 0$.

The power function (of exponent $a \in \mathbb{R}$) is defined on $(0, \infty)$ by the formula

$$1^a = 1 \quad \text{and} \quad x^a = \exp_x a \ \text{ for } x \in (0, \infty) \setminus \{1\}.$$

Alternatively, it can be introduced via the logarithmic function:

$$x^a = \left(b^{\log_b x}\right)^a = b^{a \log_b x}$$

for all $b > 0$, $b \neq 1$, and all $x > 0$. This immediately implies several properties of the power function. The power function is continuous, monotone, and multiplicative,

$$(xy)^a = x^a \cdot y^a,$$

for all $x, y > 0$ and all $a \in \mathbb{R}$. The same argument allows us to determine the behavior of the power function at the endpoints of its domain of definition. See Fig. 7.3.

Fig. 7.3. The graph of the power function $x^a$


7.2.4 Theorem For every $a \in \mathbb{R}$, we have
$$(1 + x)^a = 1 + ax + x \cdot o(x) \quad \text{as } x \to 0.$$

In particular,
$$\lim_{x \to 0} \frac{(1 + x)^a - 1}{x} = a.$$

Proof According to the asymptotic behavior of the logarithmic function and of the exponential, we get

$$(1 + x)^a - 1 = e^{a \log(1+x)} - 1 = e^{ax + x \cdot o(x)} - 1 = e^{ax} e^{x \cdot o(x)} - 1$$
$$= (1 + ax + x \cdot o(x))(1 + x \cdot o(x)) - 1$$
$$= ax + x \cdot o(x),$$

and thus $\frac{(1+x)^a - 1}{x} = a + o(x)$ as $x \to 0$.

A direct computation of the limit when $a \neq 0$ is based on the substitution $(1 + x)^a = e^y$. Then $x = e^{y/a} - 1$, so that

$$\lim_{x \to 0} \frac{(1 + x)^a - 1}{x} = \lim_{y \to 0} \frac{e^y - 1}{e^{y/a} - 1} = a \lim_{y \to 0} \frac{\dfrac{e^y - 1}{y}}{\dfrac{e^{y/a} - 1}{y/a}} = a.$$

7.2.5 Remark (Functional equations) One can introduce the logarithmic function in the same manner as the exponential, via the following companion of Theorem 7.1.1: If $a > 1$ is a real number, then there exists a unique function $f: (0, \infty) \to \mathbb{R}$ such that:

(a) $f(xy) = f(x) + f(y)$ for all $x, y \in (0, \infty)$;
(b) $f(a) = 1$;
(c) The function $f$ is strictly increasing.

This function $f$ is precisely the function $\log_a$ considered above. The logarithmic function can be defined for bases $a \in (0, 1)$ by the formula

$$\log_a x = -\log_{1/a} x.$$

Once the logarithmic function is defined, the exponential can be introduced as its inverse.

The functional equations defining the exponential and logarithm are part of a well-developed theory that starts with Cauchy's functional equation, which is nothing but the condition of additivity of a function $f: \mathbb{R} \to \mathbb{R}$:

$$f(x + y) = f(x) + f(y) \quad \text{for all } x, y \in \mathbb{R}.$$


7.2.6 Theorem Every continuous (or monotone) function $f: \mathbb{R} \to \mathbb{R}$ that verifies Cauchy's functional equation is of the form $f(x) = ax$ for a suitable real constant $a$.

Proof Put $a = f(1)$. Using mathematical induction one can show that $f(n) = na$ for all $n \in \mathbb{N}$. Since $0 = f(n + (-n)) = f(n) + f(-n)$, this equality extends to negative integers. If $r$ is a rational number of the form $r = \frac{m}{n}$, with $m \in \mathbb{Z}$ and $n \in \mathbb{N}^*$, then $ma = f(m) = f\left(n \cdot \frac{m}{n}\right) = n f\left(\frac{m}{n}\right)$, from which we get that $f(r) = ra$. The proof ends by using the approximation of real numbers via monotone sequences of rational numbers.

An extension of Theorem 7.2.6 is the subject of Exercise 14, at the end of Sect. 11.5. See also the Notes and Remarks at the end of this chapter.


Exercises 


1. Let $a > 1$ and $p > 0$. Prove the formulas:

$$\lim_{x \to \infty} \frac{x^p}{a^x} = 0, \quad \lim_{x \to \infty} \frac{\log x}{x^p} = 0, \quad \lim_{x \to 0+} x^p \log x = 0.$$

2. Formulate the two inequalities in Exercise 1, Sect. 7.1, in terms of logarithms. Infer that
$$\frac{1}{n+1} < \log(n + 1) - \log n < \frac{1}{n}$$
for all integers $n \geq 1$.

3. Prove that $x^y + y^x \geq 1$ for every $(x, y) \in [0, 1]^2 \setminus \{(0, 0)\}$.

4. (A proof of Young's Inequality by using density of $\mathbb{Q}$).
(a) Prove that $(m + n) x^m \leq m x^{m+n} + n$, for all natural numbers $m$ and $n$ and all nonnegative real numbers $x$.
(b) Infer that $s^m t^n \leq \frac{m}{m+n} s^{m+n} + \frac{n}{m+n} t^{m+n}$ for all natural numbers $m$ and $n$ with $m + n > 0$ and all nonnegative real numbers $s$ and $t$.
(c) Suppose that $p, q > 1$ and $1/p + 1/q = 1$. Use (b) and the density of $\mathbb{Q}$ in $\mathbb{R}$ to conclude that
$$xy \leq \frac{x^p}{p} + \frac{y^q}{q} \quad \text{for all } x, y \in [0, \infty).$$


5. Let $f: \mathbb{R} \to \mathbb{R}$ be an additive function, continuous at a point $x = x_0$. Prove that $f$ is continuous everywhere (and thus it is of the form $f(x) = ax$).

6. Determine all continuous positive functions $f, g, h: \mathbb{R} \to \mathbb{R}$ that verify the functional equation $\sqrt{f(x) g(y)} = h\left(\frac{x+y}{2}\right)$.

7. Prove that the power functions $x^a$ (for $a \in \mathbb{R}$) are the only solutions of the functional equation

$$f(xy) = f(x) f(y) \quad \text{for all } x, y > 0,$$

which are continuous and not identically zero.


7.3 Power Series 


A power series is a series of complex functions of the form

$$\sum_{n \geq 0} c_n (z - z_0)^n = c_0 + c_1 (z - z_0) + c_2 (z - z_0)^2 + \cdots.$$

The numbers $c_n$ are called the coefficients of the series, and the number $z_0$ is called the center.

The set of convergence of a power series is always nonempty as it contains at least the center. In the particular case of the geometric series $\sum_{n \geq 0} z^n$, this set is the unit disk $D_1(0)$, a disk whose center is precisely the center of the series under consideration. As the following theorem shows, this remark extends to all power series:


7.3.1 First Abel's Theorem Let $\sum_{n \geq 0} c_n (z - z_0)^n$ be a power series.
(a) If the series converges at $z = z_1$ and $z_1 \neq z_0$, then it converges uniformly and absolutely on every compact disk

$$\overline{D}_r(z_0) = \{z \in \mathbb{C} : |z - z_0| \leq r\},$$

where $0 < r < |z_1 - z_0|$.
(b) If the series is not absolutely convergent for $z = z_1$, then it does not converge for any of the points $z$ such that $|z - z_0| > |z_1 - z_0|$.

Proof (a) Since the series $\sum_{n \geq 0} c_n (z_1 - z_0)^n$ is convergent, there is $M > 0$ such that $|c_n (z_1 - z_0)^n| \leq M$ for all $n \geq 0$. Let $r \in [0, |z_1 - z_0|)$ and let $z \in \overline{D}_r(z_0)$. Then

$$|c_n (z - z_0)^n| = |c_n (z_1 - z_0)^n| \cdot \left(\frac{|z - z_0|}{|z_1 - z_0|}\right)^n \leq M \left(\frac{r}{|z_1 - z_0|}\right)^n$$

and since $r/|z_1 - z_0| < 1$, Weierstrass' M-Test implies the conclusion.

(b) Suppose that there is a point $z_2$ such that $|z_2 - z_0| > |z_1 - z_0|$ and the series is convergent for $z = z_2$. According to part (a), the series should be absolutely and uniformly convergent on the disk $\overline{D}_r(z_0)$, where $r = |z_1 - z_0|$, which is a contradiction.


We define the radius of convergence of a power series $\sum_{n \geq 0} c_n (z - z_0)^n$ by the formula

$$R = \sup \Big\{ |z - z_0| : \text{the series } \sum_{n \geq 0} c_n (z - z_0)^n \text{ is absolutely convergent} \Big\}$$

and we call the open disk

$$D_R(z_0) = \{z \in \mathbb{C} : |z - z_0| < R\}$$

the disk of convergence of the series. Notice that $R \in [0, \infty]$.

The two statements in First Abel's Theorem can be reformulated as follows:
(a) The series $\sum_{n \geq 0} c_n (z - z_0)^n$ is absolutely and uniformly convergent on every compact disk $\overline{D}_r(z_0)$ included in the disk of convergence.
(b) The series is not convergent at any point outside $\overline{D}_R(z_0)$.

Therefore, the set of convergence of a power series consists of all points of the disk of convergence $D_R(z_0)$ and some of the points on the circumference of this disk.

Since

$$D_R(z_0) = \bigcup_{0 < r < R} \overline{D}_r(z_0),$$

it follows that every power series is absolutely convergent on its disk of convergence. However, a power series may not be uniformly convergent on its disk of convergence. See the case of the series $\sum_{n \geq 0} (1 - z) z^n$.

7.3.2 Theorem The sum of a power series is a continuous function on its disk of convergence.

Proof Let $\sum_{n \geq 0} c_n (z - z_0)^n$ be a power series with sum $S(z)$ and radius of convergence $R > 0$. We will show that the function $S(z)$ is continuous at every point $z_1 \in D_R(z_0)$. For this, let $r$ be such that $|z_1 - z_0| < r < R$. Since the partial sums of a power series are polynomials, and the partial sums converge to $S(z)$ uniformly on the compact disk $\overline{D}_r(z_0)$, it follows from Theorem 6.8.3 that the restriction of $S(z)$ to $\overline{D}_r(z_0)$ is continuous. But $z_1$ is an interior point of this disk. Therefore, $S(z)$ is continuous at $z_1$.

Next, we will address the problem of computing the radius of convergence of a power series.

7.3.3 Theorem (Cauchy–Hadamard Formula) With the convention $1/0 = \infty$ and $1/\infty = 0$, the radius of convergence of a power series $\sum_{n \geq 0} c_n (z - z_0)^n$ is given by the formula

$$R = \frac{1}{\limsup_{n \to \infty} \sqrt[n]{|c_n|}}.$$

Proof According to the Root Test, the series $\sum_{n \geq 0} |c_n (z - z_0)^n|$ is convergent if $\limsup_{n \to \infty} \sqrt[n]{|c_n (z - z_0)^n|} < 1$ and divergent if $\limsup_{n \to \infty} \sqrt[n]{|c_n (z - z_0)^n|} > 1$. For $z \neq z_0$, $\limsup_{n \to \infty} \sqrt[n]{|c_n (z - z_0)^n|} = |z - z_0| \cdot \limsup_{n \to \infty} \sqrt[n]{|c_n|}$. Denoting $\rho = \limsup_{n \to \infty} \sqrt[n]{|c_n|}$, it follows that for $\rho |z - z_0| < 1$ the series is absolutely convergent, while for $\rho |z - z_0| > 1$ the series is not convergent. Therefore (with the convention mentioned in the statement), $R = 1/\rho$.


In the same way, one can prove Abel's Formula for the radius of convergence of a power series $\sum_{n \geq 0} c_n (z - z_0)^n$:

$$R = \lim_{n \to \infty} \left| \frac{c_n}{c_{n+1}} \right|$$

if the limit exists.
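
The two formulas for the radius of convergence can be explored numerically. The Python sketch below is an illustration only: the function names are hypothetical, and evaluating the expressions at a single large index $n$ merely suggests the value of the limit (or limit superior); it proves nothing. The two coefficient sequences are $c_n = (-1)^{n-1}/n$, whose radius is 1, and $c_n = 2^n$, whose radius is $1/2$.

```python
def ratio_estimate(c, n: int) -> float:
    """|c(n) / c(n+1)|, the quantity whose limit appears in Abel's Formula."""
    return abs(c(n) / c(n + 1))

def root_estimate(c, n: int) -> float:
    """1 / |c(n)|**(1/n), the n-th term behind the Cauchy-Hadamard Formula."""
    return 1.0 / abs(c(n)) ** (1.0 / n)

def log_coeff(n: int) -> float:
    # coefficients (-1)**(n-1) / n, defined for n >= 1
    return (-1.0) ** (n - 1) / n

def geometric_coeff(n: int) -> float:
    # coefficients 2**n
    return 2.0 ** n

for n in (10, 100, 1000):
    print(n, ratio_estimate(log_coeff, n), root_estimate(log_coeff, n),
          ratio_estimate(geometric_coeff, n), root_estimate(geometric_coeff, n))
```

In this example the ratio estimate settles faster than the root estimate, but the ratio formula applies only when the limit exists.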
At this point, we turn our attention toward power series of the form

$$\sum_{n \geq 0} c_n (x - x_0)^n,$$

having real coefficients and a real center. We can develop a parallel theory within $\mathbb{R}$, which will describe the set of convergence in the real domain. It is easy to see that in this case, the place of disks will be taken by intervals centered at $x_0$. Therefore, if we define the radius of convergence in a similar manner, we get exactly the same formulas for its computation and instead of a disk of convergence, we have an interval of convergence,

$$(x_0 - R, x_0 + R) = D_R(x_0) \cap \mathbb{R}.$$

There is an important practical consequence of this remark, namely, if a real function $S(x)$ can be described as the sum of a power series

$$S(x) = \sum_{n=0}^{\infty} c_n (x - x_0)^n$$

on an interval $(x_0 - R, x_0 + R)$, then the function can be extended to the whole disk $D_R(x_0)$ by the formula

$$S(z) = \sum_{n=0}^{\infty} c_n (z - x_0)^n.$$

This kind of extension is called extension by analyticity. The terminology is explained in Remark 7.3.5.

Typically, the elementary functions are initially defined on certain intervals and then they are represented (using differential calculus) as sums of power series. The above discussion gives us the possibility of a natural extension of these functions to the complex domain.


7.3.4 Example (a) The radius of convergence of the series $\sum_{n \geq 1} n^n z^n$ is

$$R = \lim_{n \to \infty} \frac{n^n}{(n + 1)^{n+1}} = \lim_{n \to \infty} \left(\frac{1}{1 + 1/n}\right)^n \frac{1}{n+1} = 0,$$

so that the series is divergent at every point $z \neq 0$.

(b) The radius of convergence of the series $\sum_{n \geq 1} \frac{(-1)^{n-1}}{n} x^n$ is $R = 1$, so that it converges for every $x \in (-1, 1)$. By Lemma 4.1.2, we infer that the series is divergent at every point $x \in (-\infty, -1) \cup (1, \infty)$.

The point $x = 1$ is a point of convergence. See Leibniz's Test.

For $x = -1$, the series becomes $\sum_{n \geq 1} \frac{(-1)^{n-1}}{n} (-1)^n = -\sum_{n \geq 1} \frac{1}{n}$, which we know to be divergent.

Consequently, the convergence set of the given series is the interval $(-1, 1]$. We will show in Sect. 8.5 that

$$\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} x^n = \log(1 + x) \quad \text{for every } x \in (-1, 1].$$
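
The behavior described in part (b) can be observed on partial sums. The following Python sketch is an illustration only (the helper name `log_series` is hypothetical); it compares partial sums with the library value of $\log(1+x)$ at an interior point and at the endpoint $x = 1$, where convergence is much slower.

```python
import math

def log_series(x: float, terms: int) -> float:
    """Partial sum of sum_{n>=1} (-1)**(n-1) * x**n / n with the given number of terms."""
    total = 0.0
    power = 1.0
    for n in range(1, terms + 1):
        power *= x                      # power == x**n
        total += (-1) ** (n - 1) * power / n
    return total

print(log_series(0.5, 60), math.log(1.5))     # interior point: fast convergence
print(log_series(1.0, 1000), math.log(2.0))   # endpoint x = 1: slow convergence
```

At $x = 1$ the terms alternate in sign and decrease, so the error after $N$ terms is at most $1/(N+1)$, which explains the slow convergence there.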


(c) The hypergeometric series of Gauss is presented in the Notes and Remarks at 
the end of this chapter. 


7.3.5 Remark (Analytic functions) A real-valued function $f$ defined on an open subset $U$ of $\mathbb{R}$ is called a real analytic function if it is locally given by a convergent power series. In other words, for every point $a \in U$ there exists a power series $\sum_{n=0}^{\infty} c_n (x - a)^n$ and a positive number $R$ such that $f(x) = \sum_{n=0}^{\infty} c_n (x - a)^n$ on $(a - R, a + R)$. Most of the usual functions like the exponential, the logarithm and the trigonometric functions are real analytic. Some special properties of real analytic functions are mentioned in Exercises 6, 7, and 8. In Sect. 8.6 we will prove that these functions are indefinitely differentiable and admit locally Taylor series expansions.


The concept of complex analytic function can be introduced in a similar way (for 
complex-valued functions defined on open subsets of C), but they exhibit stronger 
properties that do not hold generally for real analytic functions. Their theory can be 
found in the books of Lang [1] and Rudin [2]. 

Exercises

1. Using the power series expansion $\frac{1}{1-x} = 1 + x + x^2 + \cdots$ for $|x| < 1$,
decompose the function Ta into simple fractions and then expand it in 

power series centered at 0. 

Expand the function f (x) = x? in power series centered at 1. 

Expand the function f(x) = (1 + x) e~* in power series centered at 0. 

Find the radius of convergence of the series >’, re 

5. Suppose that the real sequence $(a_n)_n$ is bounded but the series $\sum_n a_n$ diverges. Prove that the radius of convergence of the power series $\sum_{n \geq 0} a_n x^n$ is 1.

6. Suppose that the series $\sum_{n \geq 0} c_n (x - a)^n$ (whose center and coefficients are all real numbers) is convergent on the interval $(a - R, a + R)$. Prove that its sum $S(x)$ is a real analytic function.

7. (Isolated zeros). Suppose that a real analytic function $f$, defined on an open interval $I$, vanishes on a convergent sequence of distinct elements of $I$ whose limit also belongs to $I$. Prove that $f$ is identically 0.

Therefore, the zeros of a nonzero analytic function are isolated points.

8. (Analytic Continuation). Suppose that $f_1: I_1 \to \mathbb{R}$ and $f_2: I_2 \to \mathbb{R}$ are two real analytic functions which agree on $I_1 \cap I_2 \neq \emptyset$. Prove that

$$f: I_1 \cup I_2 \to \mathbb{R}, \quad f(x) = \begin{cases} f_1(x) & \text{if } x \in I_1 \\ f_2(x) & \text{if } x \in I_2 \end{cases}$$

is an analytic function and actually the only one that extends $f_1$ to $I_1 \cup I_2$.
9. Taking into account that every positive integer $k$ can be uniquely represented as $k = 2^m (2n + 1)$ for suitable $m, n \in \mathbb{N}$, prove that

$$\sum_{n=0}^{\infty} \frac{z^{2^n}}{1 - z^{2^{n+1}}} = \frac{z}{1 - z}$$

for every complex number $z$ with $|z| < 1$.

10. Denote by $d(n)$ the number of positive divisors of the integer number $n \geq 1$. Prove that:
(a) $d(n) \leq 2\sqrt{n}$;
(b) the Lambert series $\sum_{n \geq 1} \frac{z^n}{1 - z^n}$ is convergent inside the unit disc and

$$\sum_{n=1}^{\infty} \frac{z^n}{1 - z^n} = \sum_{n=1}^{\infty} d(n) z^n = z + 2z^2 + 2z^3 + 3z^4 + \cdots.$$

[Hint: According to the Fundamental Theorem of Arithmetic, every integer $n \geq 2$ has a unique decomposition as a product of prime powers, $n = p_1^{n_1} \cdots p_r^{n_r}$. In this case, $d(n) = \prod_{k=1}^{r} (1 + n_k)$.]


7.4 The Exponential and the Trigonometric Functions


The exponential function is defined in the complex domain by using the extension by analyticity of the power series obtained in Sect. 7.1, in the case of a real variable:

$$e^z = \sum_{n=0}^{\infty} \frac{z^n}{n!} = 1 + \frac{z}{1!} + \frac{z^2}{2!} + \cdots, \quad z \in \mathbb{C}.$$

According to Abel's Formula, the radius of convergence of this series is $\infty$, a fact that assures its absolute convergence on $\mathbb{C}$ (as well as its uniform convergence on every compact disk in $\mathbb{C}$). By Theorem 6.8.3, the exponential function is continuous on $\mathbb{C}$.

The functional equation verified by the exponential function in the real case also works in the complex case.

7.4.1 Lemma We have
$$e^{z_1} e^{z_2} = e^{z_1 + z_2} \quad \text{for all } z_1, z_2 \in \mathbb{C}.$$

The exponential function of basis $a > 0$ is defined by the formula
$$a^z = e^{z \log a}, \quad z \in \mathbb{C}.$$
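
The series definition and the functional equation of Lemma 7.4.1 can be checked on partial sums. The Python sketch below is an illustration under stated assumptions: `exp_series` is a hypothetical helper that truncates the series after 60 terms (ample for the small arguments used here), and the comparison with `cmath.exp` holds only up to rounding.

```python
import cmath

def exp_series(z: complex, terms: int = 60) -> complex:
    """Partial sum of sum_{n>=0} z**n / n! using the first `terms` terms."""
    total = 0j
    term = 1 + 0j                 # z**0 / 0!
    for n in range(terms):
        total += term
        term *= z / (n + 1)       # turns z**n / n! into z**(n+1) / (n+1)!
    return total

z1, z2 = 1 + 2j, -0.5 + 0.3j
print(exp_series(z1), cmath.exp(z1))
print(exp_series(z1) * exp_series(z2), exp_series(z1 + z2))
```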


The functions sin and cos can be defined in the complex domain in a similar way:

$$\sin z = \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n+1)!} z^{2n+1} = z - \frac{z^3}{3!} + \frac{z^5}{5!} - \cdots, \quad z \in \mathbb{C};$$
$$\cos z = \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n)!} z^{2n} = 1 - \frac{z^2}{2!} + \frac{z^4}{4!} - \cdots, \quad z \in \mathbb{C}.$$

The function sin is odd, the function cos is even, $\sin 0 = 0$, and $\cos 0 = 1$. Both functions take real values when applied to real numbers.


7.4.2 Theorem (Euler's Formulas) The functions sin, cos and exp are related by the following formulas,

$$e^{iz} = \cos z + i \sin z$$
$$\sin z = \frac{e^{iz} - e^{-iz}}{2i}$$
$$\cos z = \frac{e^{iz} + e^{-iz}}{2},$$

for all $z \in \mathbb{C}$.
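
Euler's formulas hold for complex arguments, not only real ones, and this can be spot-checked numerically. The following Python sketch is an illustration only; it relies on the standard `cmath` functions and tests a single, arbitrarily chosen non-real point.

```python
import cmath

z = 0.7 - 1.2j                                   # an arbitrary non-real test point

euler_lhs = cmath.exp(1j * z)                    # e^{iz}
euler_rhs = cmath.cos(z) + 1j * cmath.sin(z)     # cos z + i sin z

sin_from_exp = (cmath.exp(1j * z) - cmath.exp(-1j * z)) / 2j
cos_from_exp = (cmath.exp(1j * z) + cmath.exp(-1j * z)) / 2

print(abs(euler_lhs - euler_rhs))
print(abs(sin_from_exp - cmath.sin(z)), abs(cos_from_exp - cmath.cos(z)))
```

All printed differences should be at the level of floating-point rounding.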

7.4.3 Corollary For all $x \in \mathbb{R}$,

$$\cos x = \operatorname{Re} e^{ix} \quad \text{and} \quad \sin x = \operatorname{Im} e^{ix}.$$

By Euler's formulas,
$$\sin^2 z + \cos^2 z = 1 \quad \text{for all } z \in \mathbb{C},$$

which yields
$$|\sin x| \leq 1 \quad \text{and} \quad |\cos x| \leq 1 \quad \text{for all } x \in \mathbb{R}.$$

The functions sin and cos are unbounded in the complex domain. For example, $\cos(in) \to \infty$ as $n \to \infty$.

Euler's formulas allow us to recover many other trigonometric formulas relating the functions sin and cos:

$$\sin(z_1 + z_2) = \sin z_1 \cos z_2 + \cos z_1 \sin z_2$$
$$\cos(z_1 + z_2) = \cos z_1 \cos z_2 - \sin z_1 \sin z_2$$
$$\sin z_1 + \sin z_2 = 2 \sin \frac{z_1 + z_2}{2} \cos \frac{z_1 - z_2}{2}$$
$$\cos z_1 + \cos z_2 = 2 \cos \frac{z_1 + z_2}{2} \cos \frac{z_1 - z_2}{2}$$

and so on.
The fact that the functions sin and cos are periodic, with prime (that is, smallest positive) period $2\pi$, requires some extra analysis, which will be done next.


7.4.4 Lemma The cosine function is strictly decreasing on the interval $[0, \sqrt{6}]$ and there is only one real number in this interval, denoted $\pi/2$, at which this function vanishes.

Proof Notice first that $\sin x > 0$ for $0 < x < \sqrt{6}$. In fact,

$$\sin x = x \left(1 - \frac{x^2}{2 \cdot 3}\right) + \frac{x^5}{5!} \left(1 - \frac{x^2}{6 \cdot 7}\right) + \cdots$$

and all parentheses are positive on the interval $(0, \sqrt{6})$.
If $0 \leq x < y \leq \sqrt{6}$, then $(y + x)/2 \in (0, \sqrt{6})$ and thus

$$\cos y - \cos x = -2 \sin \frac{y - x}{2} \sin \frac{y + x}{2} < 0.$$

This implies that the cosine function is strictly decreasing on the interval $[0, \sqrt{6}]$.

On the other hand,

$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots = 1 - \frac{x^2}{2!} \left(1 - \frac{x^2}{3 \cdot 4}\right) - \frac{x^6}{6!} \left(1 - \frac{x^2}{7 \cdot 8}\right) - \cdots$$

and since all parentheses are positive on the interval $(0, \sqrt{6})$, we get

$$\cos x < 1 - \frac{x^2}{2!} \left(1 - \frac{x^2}{3 \cdot 4}\right).$$

In particular, $\cos 2 < -1/3 < 0$.
Since the cosine function is continuous on the interval $[0, 2]$ and at the endpoints has values of opposite sign, the Intermediate Value Theorem implies that the cosine function has at least one root in this interval. The root is unique because the cosine function is strictly decreasing on $[0, \sqrt{6}]$.
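
The proof of Lemma 7.4.4 locates the root of the cosine in $[0, 2]$ by a sign change, and the same idea can be run numerically by bisection. The Python sketch below is an illustration only: `cos_series` (a truncated form of the defining series) and `root_of_cos` are hypothetical helpers, and the library constant `math.pi` is used purely for comparison.

```python
import math

def cos_series(x: float, terms: int = 30) -> float:
    """Partial sum of sum_{n>=0} (-1)**n * x**(2n) / (2n)!."""
    total = 0.0
    term = 1.0                                    # n = 0 term
    for n in range(terms):
        total += term
        term *= -x * x / ((2 * n + 1) * (2 * n + 2))
    return total

def root_of_cos(lo: float = 0.0, hi: float = 2.0, steps: int = 60) -> float:
    """Bisection on [lo, hi], assuming cos_series(lo) > 0 > cos_series(hi)."""
    for _ in range(steps):
        mid = (lo + hi) / 2
        if cos_series(mid) > 0:
            lo = mid          # the root lies to the right of mid
        else:
            hi = mid          # the root lies to the left of (or at) mid
    return (lo + hi) / 2

half_pi = root_of_cos()
print(cos_series(2.0))            # negative, in line with cos 2 < -1/3
print(2 * half_pi, math.pi)
```

Bisection is valid here because the lemma guarantees a single sign change of the cosine on $[0, 2]$.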


As Lindemann showed in 1882, the number $\pi$ (the double of the value $\pi/2$ above) is transcendental. Its value to 10 digits is $\pi = 3.141592654\ldots$. The number $\pi$ is related to the formula which gives the length of a circle as a function of its radius:

$$L = 2\pi R.$$

Since the sine function is nonnegative on the interval $[0, \sqrt{6}]$ and $\cos^2 \frac{\pi}{2} + \sin^2 \frac{\pi}{2} = 1$, we get
$$\sin \frac{\pi}{2} = 1.$$

Then, from the trigonometric formulas above, we infer that:

$$\sin\left(\frac{\pi}{2} + x\right) = \sin \frac{\pi}{2} \cos x + \cos \frac{\pi}{2} \sin x = \cos x$$
$$\cos\left(\frac{\pi}{2} + x\right) = -\sin x.$$

Since on the interval $[0, \pi/2]$ the cosine function is decreasing from 1 to 0, these formulas show that on the same interval the sine function increases from 0 to 1. Then,

$$\sin(\pi + x) = \sin\left(\frac{\pi}{2} + \left(\frac{\pi}{2} + x\right)\right) = \cos\left(\frac{\pi}{2} + x\right) = -\sin x$$
$$\cos(\pi + x) = -\cos x$$
$$\sin(2\pi + x) = \sin(\pi + (\pi + x)) = -\sin(\pi + x) = \sin x$$
$$\cos(2\pi + x) = \cos x,$$
for all $x \in \mathbb{R}$. The last two formulas show that the functions sin and cos are periodic, and $2\pi$ is a period for both.

The last group of formulas also shows that the functions sin and cos are strictly monotone on each of the intervals

$$\left[0, \frac{\pi}{2}\right], \quad \left[\frac{\pi}{2}, \pi\right], \quad \left[\pi, \frac{3\pi}{2}\right], \quad \left[\frac{3\pi}{2}, 2\pi\right].$$

Their signs vary as in Table 7.1.

Table 7.1 The sign variation of sin and cos

x      | 0 |   | π/2 |   | π  |   | 3π/2 |   | 2π
sin x  | 0 | + | 1   | + | 0  | − | −1   | − | 0
cos x  | 1 | + | 0   | − | −1 | − | 0    | + | 1

The number $2\pi$ is the smallest positive number $T$ such that

$$\sin(x + T) = \sin x \quad \text{for all } x \in \mathbb{R},$$

and a similar assertion is true for the function cos. To see this, notice first that if we plug in $x = 0$, we get $\sin T = 0$. If $0 < T < 2\pi$, then the table above implies that $T = \pi$. This is a contradiction, as $\sin \frac{\pi}{2} = 1$ and $\sin\left(\frac{\pi}{2} + \pi\right) = -1$. Therefore, $T \geq 2\pi$.

The graphs of sin and cos on $[0, 2\pi]$ are sketched in Fig. 7.4.

Fig. 7.4 The graphs of sin and cos on the interval $[0, 2\pi]$


7.4.5 Theorem (a) The function $f(z) = e^z$, $z \in \mathbb{C}$, is periodic, of imaginary period $2\pi i$.

(b) The function $f(z) = e^{iz}$, $z \in \mathbb{C}$, is periodic, of period $2\pi$.

(c) The functions sin and cos, defined on $\mathbb{C}$, are periodic, of period $2\pi$.

Proof (a) In fact, for every two real numbers $x$ and $y$ we have

$$e^{x + iy} = e^x e^{iy} = e^x (\cos y + i \sin y).$$

The assertion (b) follows from (a), while (c) is a consequence of (b).

7.4.6 Theorem (Two basic trigonometric limits)

$$\lim_{z \to 0} \frac{\sin z}{z} = 1 \quad \text{and} \quad \lim_{z \to 0} \frac{1 - \cos z}{z^2} = \frac{1}{2}.$$

Proof For $|z| < 1$, $z \neq 0$, we have

$$\left| \frac{\sin z - z}{z} \right| = \left| -\frac{z^2}{3!} + \frac{z^4}{5!} - \frac{z^6}{7!} + \cdots \right| \leq |z|^2 \left( \frac{1}{3!} + \frac{1}{5!} + \frac{1}{7!} + \cdots \right) < |z|^2,$$

which implies the first limit. The second limit can be justified in a similar way.

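7.4.7 Lemma For every $x \in \mathbb{R}$, $|\sin x| \leq |x|$, with equality only if $x = 0$.

Proof This inequality is obvious for $|x| > 1$. Thus, we only need to consider the case $|x| \leq 1$. Then,

$$\sin x = x \left(1 - \frac{x^2}{2 \cdot 3}\right) + \frac{x^5}{5!} \left(1 - \frac{x^2}{6 \cdot 7}\right) + \cdots$$
$$= x - \frac{x^3}{3!} \left(1 - \frac{x^2}{4 \cdot 5}\right) - \frac{x^7}{7!} \left(1 - \frac{x^2}{8 \cdot 9}\right) - \cdots$$

and all parentheses are positive for $x \in [-1, 1]$. Therefore,

$$0 < \sin x < x \ \text{ if } x \in (0, 1] \quad \text{and} \quad 0 > \sin x > x \ \text{ if } x \in [-1, 0),$$

which completes the proof.

7.4.8 Corollary For every $x, y \in \mathbb{R}$, we have the inequalities

$$|\sin x - \sin y| \leq |x - y|$$
$$|\cos x - \cos y| \leq |x - y|.$$

Therefore, the functions sin and cos are Lipschitz.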


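7.4.9 Remark (The computation of trigonometric sums) The connection between the exponential and the trigonometric functions gives us an easy way of computing some trigonometric sums. For example,

$$\frac{1}{2} + \cos\theta + \cos 2\theta + \cdots + \cos n\theta = \frac{1}{2} + \frac{e^{i\theta} + e^{-i\theta}}{2} + \frac{e^{2i\theta} + e^{-2i\theta}}{2} + \cdots + \frac{e^{ni\theta} + e^{-ni\theta}}{2}$$
$$= \frac{e^{-ni\theta} + \cdots + e^{-i\theta} + 1 + e^{i\theta} + \cdots + e^{ni\theta}}{2}$$
$$= \frac{e^{-ni\theta} - e^{(n+1)i\theta}}{2(1 - e^{i\theta})} = \frac{e^{(n+1/2)i\theta} - e^{-(n+1/2)i\theta}}{2(e^{i\theta/2} - e^{-i\theta/2})}$$
$$= \frac{\sin (2n+1)\frac{\theta}{2}}{2 \sin \frac{\theta}{2}}$$

for all $n \in \mathbb{N}^*$. Alternatively, denoting
$$C = 1 + \cos\theta + \cos 2\theta + \cdots + \cos n\theta,$$
and interpreting 1 as $\cos 0$, we have to consider the conjugate expression,
$$S = \sin\theta + \sin 2\theta + \cdots + \sin n\theta,$$
and next to compute

$$C + iS = 1 + e^{i\theta} + e^{2i\theta} + \cdots + e^{ni\theta} = \frac{e^{(n+1)i\theta} - 1}{e^{i\theta} - 1} = \frac{e^{(n+1/2)i\theta} - e^{-i\theta/2}}{e^{i\theta/2} - e^{-i\theta/2}} = \frac{e^{(n+1/2)i\theta} - e^{-i\theta/2}}{2i \sin \frac{\theta}{2}}.$$

From here, $C$ is the real part of the resulting sum, while $S$ is the imaginary part. Therefore,

$$C = \frac{\sin \frac{(n+1)\theta}{2} \cdot \cos \frac{n\theta}{2}}{\sin \frac{\theta}{2}}, \quad S = \frac{\sin \frac{(n+1)\theta}{2} \cdot \sin \frac{n\theta}{2}}{\sin \frac{\theta}{2}}.$$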
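
The closed forms for $C$ and $S$ can be compared with term-by-term summation. The Python sketch below is an illustration only; the four function names are hypothetical, and the closed forms are evaluated under the assumption $\sin(\theta/2) \neq 0$, which the division requires.

```python
import math

def cosine_sum(theta: float, n: int) -> float:
    """C = 1 + cos(theta) + ... + cos(n*theta), summed term by term."""
    return sum(math.cos(k * theta) for k in range(n + 1))

def sine_sum(theta: float, n: int) -> float:
    """S = sin(theta) + ... + sin(n*theta), summed term by term."""
    return sum(math.sin(k * theta) for k in range(1, n + 1))

def cosine_sum_closed(theta: float, n: int) -> float:
    return math.sin((n + 1) * theta / 2) * math.cos(n * theta / 2) / math.sin(theta / 2)

def sine_sum_closed(theta: float, n: int) -> float:
    return math.sin((n + 1) * theta / 2) * math.sin(n * theta / 2) / math.sin(theta / 2)

theta, n = 0.7, 25
print(cosine_sum(theta, n), cosine_sum_closed(theta, n))
print(sine_sum(theta, n), sine_sum_closed(theta, n))
# First formula of the remark: 1/2 + cos(theta) + ... + cos(n*theta)
print(cosine_sum(theta, n) - 0.5,
      math.sin((2 * n + 1) * theta / 2) / (2 * math.sin(theta / 2)))
```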


7.4.10 The functions Tangent and Cotangent These functions are, respectively, defined by
$$\tan z = \frac{\sin z}{\cos z}, \quad z \in \mathbb{C} \setminus \left\{(2k + 1)\frac{\pi}{2};\ k \in \mathbb{Z}\right\}$$

and
$$\cot z = \frac{\cos z}{\sin z}, \quad z \in \mathbb{C} \setminus \{k\pi;\ k \in \mathbb{Z}\}.$$

The excluded points are exactly the zeros of the denominators. For example, $\cos z = 0$ is equivalent to $e^{2iz} + 1 = 0$, which translates into

$$e^{-2y} e^{2ix} = -1$$

via the substitution $z = x + iy$ (with $x, y \in \mathbb{R}$). Taking the absolute value of both sides, we get $e^{-2y} = 1$, whence $y = 0$. Then the equation becomes

$$\cos 2x = -1 \quad \text{and} \quad \sin 2x = 0,$$

from which we conclude that the solutions of the equation $\cos z = 0$ are the points $z = (2k + 1)\frac{\pi}{2}$, where $k \in \mathbb{Z}$.

The restrictions of the functions tan and cot to $\mathbb{R}$ are continuous and periodic, of period $\pi$. They are connected through the formula

$$\tan\left(\frac{\pi}{2} + x\right) = -\cot x.$$

The tangent function is strictly increasing on the interval $(-\pi/2, \pi/2)$, while the cotangent function is strictly decreasing on the interval $(0, \pi)$.
Their graphs are sketched in Fig. 7.5.

Fig. 7.5 a The graph of the tangent function on the interval $(-\pi/2, \pi/2)$, b The graph of the cotangent function on the interval $(-\pi, \pi)$


7.4.11 The Inverse Trigonometric Functions The function sin is continuous and increasing on the interval $[-\pi/2, \pi/2]$, and maps this interval onto $[-1, 1]$. According to Theorem 6.4.6, the function sin induces a homeomorphism between $[-\pi/2, \pi/2]$ and $[-1, 1]$. We will call the inverse of this homeomorphism the arcsine function. It is the function

$$\arcsin : [-1, 1] \to \left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$$

such that

$$\sin(\arcsin x) = x \quad \text{for all } x \in [-1, 1]$$
$$\arcsin(\sin x) = x \quad \text{for all } x \in \left[-\frac{\pi}{2}, \frac{\pi}{2}\right].$$

Notice that for $a \in [-1, 1]$, the equation
$$\sin x = a$$
has the solutions $x_k = (-1)^k \arcsin a + k\pi$, $k \in \mathbb{Z}$.

The function cos is continuous and strictly decreasing on the interval $[0, \pi]$, $\cos 0 = 1$, and $\cos \pi = -1$. According to Theorem 6.4.6, the function cos induces a homeomorphism between $[0, \pi]$ and $[-1, 1]$. We will call the inverse of this homeomorphism the arccosine function. It is the function

$$\arccos : [-1, 1] \to [0, \pi]$$

such that

$$\cos(\arccos x) = x \quad \text{for all } x \in [-1, 1]$$
$$\arccos(\cos x) = x \quad \text{for all } x \in [0, \pi].$$

For $a \in [-1, 1]$, the equation
$$\cos x = a$$
has the solutions $x_k = \pm \arccos a + 2k\pi$, $k \in \mathbb{Z}$.
Notice that
$$\arcsin x + \arccos x = \frac{\pi}{2} \quad \text{for all } x \in [-1, 1].$$


The restriction of the function tan to the interval $(-\pi/2, \pi/2)$ is continuous, strictly increasing and

$$\lim_{x \to -\frac{\pi}{2}+} \tan x = -\infty, \quad \lim_{x \to \frac{\pi}{2}-} \tan x = \infty.$$

According to Theorem 6.4.6, the function tan induces a homeomorphism between $(-\pi/2, \pi/2)$ and $\mathbb{R}$. We will call the inverse of this homeomorphism the arctangent function. It is the function

$$\arctan : \mathbb{R} \to \left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$$

such that

$$\tan(\arctan x) = x \quad \text{for all } x \in \mathbb{R}$$
$$\arctan(\tan x) = x \quad \text{for all } x \in \left(-\frac{\pi}{2}, \frac{\pi}{2}\right).$$

For $a \in \mathbb{R}$, the equation
$$\tan x = a$$
has the solutions $x_k = \arctan a + k\pi$, $k \in \mathbb{Z}$.
The function arccotangent (denoted arccot) is defined in a similar way.


7.4.12 The Trigonometric Form of Complex Numbers The polar radius of a point $z = (x, y)$ from $\mathbb{R}^2$ is defined by the formula

$$r = \sqrt{x^2 + y^2}.$$

When $r > 0$, we notice that

$$\left(\frac{x}{\sqrt{x^2 + y^2}}\right)^2 + \left(\frac{y}{\sqrt{x^2 + y^2}}\right)^2 = 1,$$

which implies the existence of a unique number $\theta \in [0, 2\pi)$ such that

$$\cos\theta = \frac{x}{\sqrt{x^2 + y^2}} \quad \text{and} \quad \sin\theta = \frac{y}{\sqrt{x^2 + y^2}},$$

called the polar angle of the point $z$. In fact,

$$\theta = \begin{cases} \arctan(y/x) & \text{if } x > 0,\ y \geq 0 \\ \arctan(y/x) + 2\pi & \text{if } x > 0,\ y < 0 \\ \arctan(y/x) + \pi & \text{if } x < 0,\ y \in \mathbb{R} \\ \pi/2 & \text{if } x = 0,\ y > 0 \\ 3\pi/2 & \text{if } x = 0,\ y < 0. \end{cases}$$

The numbers $r$ and $\theta$ are called the polar coordinates of the point $z$. See Fig. 7.6. They are defined everywhere except for the origin of $\mathbb{R}^2$, in which case $r = 0$, but $\theta$ is meaningless.

We can recover the Cartesian coordinates via

$$x = r \cos\theta$$
$$y = r \sin\theta.$$
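
The case distinction for the polar angle translates directly into code. The Python sketch below is an illustration only: `polar_coordinates` is a hypothetical helper that follows the five cases of the text, and the comparison with `math.atan2` reduced modulo $2\pi$ is an independent cross-check chosen for this sketch.

```python
import math

def polar_coordinates(x: float, y: float):
    """Return (r, theta) with theta in [0, 2*pi), following the case distinction above."""
    r = math.hypot(x, y)
    if r == 0.0:
        raise ValueError("the polar angle is not defined at the origin")
    if x > 0 and y >= 0:
        theta = math.atan(y / x)
    elif x > 0:                       # x > 0, y < 0
        theta = math.atan(y / x) + 2 * math.pi
    elif x < 0:                       # any y
        theta = math.atan(y / x) + math.pi
    elif y > 0:                       # x == 0
        theta = math.pi / 2
    else:                             # x == 0, y < 0
        theta = 3 * math.pi / 2
    return r, theta

points = [(1, 1), (-1, 2), (-3, -4), (2, -5), (0, 3), (0, -2), (-1, 0)]
for (x, y) in points:
    r, theta = polar_coordinates(x, y)
    # Recover the Cartesian coordinates and compare the angle with atan2.
    print((x, y), (r * math.cos(theta), r * math.sin(theta)),
          theta, math.atan2(y, x) % (2 * math.pi))
```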

Fig. 7.6 Polar coordinates


7.4.13 Theorem The function
$$\beta : (0, \infty) \times (0, 2\pi) \to \mathbb{R}^2 \setminus \{(x, 0) : x \geq 0\}$$

given by $\beta(r, \theta) = (r \cos\theta, r \sin\theta)$, is a homeomorphism.


The details are left to the reader as an exercise. 
Identifying $\mathbb{C}$ with $\mathbb{R}^2$, the above construction motivates the possibility of expressing a complex number in a trigonometric or exponential form:

$$z = x + iy = r(\cos\theta + i \sin\theta) = r e^{i\theta}.$$

Using mathematical induction and the trigonometric form of a complex number, one can easily infer de Moivre's Formula for natural powers:

$$z^n = r^n (\cos n\theta + i \sin n\theta).$$
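
De Moivre's Formula can be spot-checked against direct multiplication. The following Python sketch is an illustration only; `de_moivre` is a hypothetical helper, `cmath.rect` builds $z$ from its polar coordinates, and the values of $r$, $\theta$ and $n$ are arbitrary.

```python
import cmath
import math

def de_moivre(r: float, theta: float, n: int) -> complex:
    """r**n * (cos(n*theta) + i*sin(n*theta))."""
    return r ** n * complex(math.cos(n * theta), math.sin(n * theta))

r, theta, n = 1.5, 0.8, 7
z = cmath.rect(r, theta)          # z = r (cos theta + i sin theta)
print(z ** n, de_moivre(r, theta, n))
```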


Exercises 


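1. Check the formulas

$$1 + 2r \cos\theta + 2r^2 \cos 2\theta + \cdots = \frac{1 - r^2}{1 - 2r \cos\theta + r^2}$$
$$r \sin\theta + r^2 \sin 2\theta + r^3 \sin 3\theta + \cdots = \frac{r \sin\theta}{1 - 2r \cos\theta + r^2},$$

which work for all $r \in [0, 1)$ and all $\theta \in \mathbb{R}$.

2. Prove that

$$\sin z = z - \frac{z^3}{3!} + o(z^3)$$
$$\cos z = 1 - \frac{z^2}{2!} + o(z^2)$$

as $z \to 0$.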


3. (The Abel–Dirichlet Test for series of functions). The series of functions $\sum_{n \geq 0} a_n(x) b_n(x)$ is uniformly convergent on $[a, b]$ if on this interval the sequence $\left(\sum_{k=0}^{n} a_k(x)\right)_n$ is uniformly bounded and the sequence $(b_n(x))_n$ decreases uniformly to 0.

[Hint: See the Abel–Dirichlet Test for numerical series.]

4. Apply the Abel–Dirichlet Test to prove that the series

$$\sum_{n \geq 1} \frac{\sin nx}{n} \quad \text{and} \quad \sum_{n \geq 1} \frac{\cos nx}{n}$$

are uniformly convergent on every interval $[\varepsilon, 2\pi - \varepsilon]$, where $0 < \varepsilon < \pi$.
[Hint: See Remark 7.4.9 for the computation of the trigonometric sums.]

5. Prove that ="* sme = =I COS 57

6. (The hyperbolic functions). The hyperbolic sine function and the hyperbolic cosine function are defined in the complex domain by

$$\sinh z = \frac{e^z - e^{-z}}{2} = z + \frac{z^3}{3!} + \frac{z^5}{5!} + \cdots$$
$$\cosh z = \frac{e^z + e^{-z}}{2} = 1 + \frac{z^2}{2!} + \frac{z^4}{4!} + \cdots.$$

Thus, $\sin x = -i \sinh ix$ and $\cos x = \cosh ix$, for every $x \in \mathbb{R}$. Prove that:

(a) The two functions are continuous and periodic, of period $2\pi i$. Moreover,
$$\cosh^2 z - \sinh^2 z = 1 \quad \text{for every } z \in \mathbb{C}.$$

(b) The function sinh is odd, while the function cosh is even. Sketch the graphs of the restrictions to $\mathbb{R}$ of these two functions.

(c) Show that Osborn's Principle holds: Every trigonometric formula can be transformed into a formula for the hyperbolic functions, replacing cos by cosh and sin by sinh.

7. Find the general term of the recursive sequence

$$x_0 \geq -2, \quad x_{n+1} = \sqrt{2 + x_n} \quad \text{for } n \in \mathbb{N}.$$

8. Verify that the hyperbolic tangent function $\tanh x = \frac{\sinh x}{\cosh x}$ is the only continuous function $f: \mathbb{R} \to \mathbb{R}$ satisfying the functional equation

$$f(x + y) = \frac{f(x) + f(y)}{1 + f(x) f(y)}$$

for all $x, y \in \mathbb{R}$.

9. Find all continuous functions $f: \mathbb{R} \to \mathbb{R}$ satisfying the functional equation

$$f(x + y) + f(x - y) = 2 f(x) f(y) \quad \text{for all } x, y \in \mathbb{R}.$$

10. Find all continuous and bounded functions $f, g: \mathbb{R} \to \mathbb{R}$ satisfying the system of functional equations,

$$f(x + y) = f(x) f(y) - g(x) g(y)$$
$$g(x + y) = f(x) g(y) + f(y) g(x),$$

for all $x, y \in \mathbb{R}$.

11. Prove that all continuous solutions $f: \mathbb{R} \to \mathbb{R}$ of d'Alembert's functional equation,

$$f(x)^2 - f(y)^2 = f(x + y) f(x - y),$$

are of the form

$$f(x) = ax, \quad f(x) = a \sin \omega x, \quad \text{and} \quad f(x) = a \sinh \omega x,$$

where $a$ and $\omega$ are real parameters.


7.5 Notes and Remarks 


The elementary functions are the functions of one complex variable built up from a 
finite number of constant functions, power functions, exponential functions, logarith- 
mic functions, trigonometric functions, and inverse trigonometric functions, using 
the four arithmetic operations (addition, subtraction, multiplication, and division) and 
composition of functions, applied finitely many times. Every elementary function is 
continuous on its domain. 

The notion of elementary functions was introduced by Joseph Liouville in a series of papers from 1833 to 1841. He proved that the antiderivative of $e^{x^2}$ is not an elementary function.

Georg Karl Wilhelm Hamel was the first to notice the existence of additive functions $f: \mathbb{R} \to \mathbb{R}$ which are nowhere continuous. This means that the conclusion of Theorem 7.2.6 is not valid in the absence of supplementary hypotheses on $f$. Hamel's example is based on Zorn's lemma (and thus on the Axiom of Choice). Let $(e_\alpha)_{\alpha \in A}$ be an algebraic basis of $\mathbb{R}$ seen as a vector space over the field $\mathbb{Q}$. Every real number $x$ can be (uniquely) represented as a linear combination with rational coefficients of the elements of the basis,

$$x = \sum_{\alpha \in A} c_\alpha e_\alpha;$$

here, for each $x$, at most finitely many coefficients $c_\alpha$ are nonzero. The coefficients which appear in the representation of $x$ are functions of $x$:

$$c_\alpha : \mathbb{R} \to \mathbb{Q}, \quad x \mapsto c_\alpha(x) \quad (\alpha \in A).$$

Because of the uniqueness of the representation of an element with respect to an algebraic basis, the functions $c_\alpha$ are additive. Indeed,

$$x + y = \sum_{\alpha \in A} c_\alpha(x + y) e_\alpha = \sum_{\alpha \in A} c_\alpha(x) e_\alpha + \sum_{\alpha \in A} c_\alpha(y) e_\alpha = \sum_{\alpha \in A} (c_\alpha(x) + c_\alpha(y)) e_\alpha$$

so that $c_\alpha(x + y) = c_\alpha(x) + c_\alpha(y)$ for all $x, y \in \mathbb{R}$ and all $\alpha \in A$. Since they take only rational values, the functions $c_\alpha$ cannot be continuous unless they are constant. Taking into account Exercise 5 at the end of Sect. 7.2, we conclude that the functions $c_\alpha$ are continuous nowhere.

A nice account on the topic of functional equations can be found in the paper of 
Janos Aczél [3]. 

The problem of representing functions as limits of sequences of functions having special properties has received a great deal of attention. Functions defined as sums of power series are one instance. We will illustrate here the case of Gauss's hypergeometric function, which is based on the following test of convergence:

7.5.1 Gauss's Test Let $\sum a_n$ be a series of positive numbers. Suppose that

$$\frac{a_n}{a_{n+1}} = \lambda + \frac{\mu}{n} + \frac{\theta_n}{n^2},$$

where $\lambda$ and $\mu$ are real numbers that do not depend on $n$ and $(\theta_n)_n$ is a bounded sequence.

If $\lambda < 1$, the series is divergent.
If $\lambda > 1$, the series is convergent.
If $\lambda = 1$, the series is convergent if and only if $\mu > 1$.

The cases where $\lambda > 1$ or $\lambda < 1$ can be settled via d'Alembert's Test. The third case follows from the Raabe–Duhamel Test.

The hypergeometric function of parameters $\alpha, \beta, \gamma \in \mathbb{R}\setminus\{0, -1, -2, \dots\}$ is
defined via the formula

\[ F(\alpha, \beta, \gamma, z) = 1 + \sum_{n=1}^{\infty}
   \frac{\alpha(\alpha+1)\cdots(\alpha+n-1)\,\beta(\beta+1)\cdots(\beta+n-1)}{\gamma(\gamma+1)\cdots(\gamma+n-1)\, n!}\, z^n. \]

The radius of convergence of the power series is $R = 1$. The series is absolutely
convergent at $z = 1$ if $\gamma > \alpha + \beta$ and divergent if $\gamma \le \alpha + \beta$. The series is
absolutely convergent at $z = -1$ if $\gamma > \alpha + \beta$; it is convergent at this point if
$-1 < \gamma - \alpha - \beta \le 0$ and divergent if $\gamma - \alpha - \beta \le -1$.

The hypergeometric function is related to many elementary functions. For
example,

\[ F(-n, \beta, \beta, -x) = (1+x)^n, \]
\[ x\,F(1, 1, 2, -x) = \log(1+x), \]
\[ x\,F\left(\tfrac12, \tfrac12, \tfrac32, x^2\right) = \arcsin x, \]

and

\[ \lim_{\beta\to\infty} F(1, \beta, 1, x/\beta) = e^x. \]
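These identities can be checked numerically by summing the defining power series. The Python sketch below is only an illustration under stated assumptions: the function name `hypergeometric_partial_sum` and the cutoff of 200 terms are our own choices, and a truncated series is trustworthy only well inside the disc $|z| < 1$.

```python
import math

def hypergeometric_partial_sum(alpha, beta, gamma, z, terms=200):
    """Partial sum of the Gauss hypergeometric series F(alpha, beta, gamma, z).

    Hypothetical helper: it uses the ratio of consecutive terms,
    (alpha + n)(beta + n) / ((gamma + n)(n + 1)) * z, and assumes |z| < 1.
    """
    total = 1.0
    term = 1.0
    for n in range(terms):
        term *= (alpha + n) * (beta + n) / ((gamma + n) * (n + 1)) * z
        total += term
    return total

x = 0.5
# x F(1, 1, 2, -x) compared with log(1 + x)
log_side = x * hypergeometric_partial_sum(1.0, 1.0, 2.0, -x)
# x F(1/2, 1/2, 3/2, x^2) compared with arcsin x
arcsin_side = x * hypergeometric_partial_sum(0.5, 0.5, 1.5, x * x)
# F(-3, beta, beta, -x) compared with (1 + x)^3; the series terminates
cube_side = hypergeometric_partial_sum(-3.0, 2.0, 2.0, -x)

print(log_side, math.log(1 + x))
print(arcsin_side, math.asin(x))
print(cube_side, (1 + x) ** 3)
```

Updating each term from the previous one avoids computing large factorials and rising products separately.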


In 1978, Ralph William Gosper [4] indicated a fast algorithm for computing the
values of hypergeometric functions. An up-to-date account of the main features of
this algorithm (as well as some of its generalizations) can be found in the book of
Marko Petkovšek, Herbert Wilf and Doron Zeilberger [5]. See also [7].


Chapter 8 
Differential Calculus on R 


Differential calculus is devoted to the study of differentiable functions. Historically, 
there were two sources of differential calculus: the problem of finding the slope of the 
tangent line to the graph of a function, and the problem of finding the instantaneous 
speed of a moving object. 

The tangent line to the graph of a function f at the point P (a, f(a)) is the line 
passing through the point P that gives the best linear approximation of the graph in 
a neighborhood of the point P. See Fig. 8.1. Technically, the problem is to find the 
slope of this line. 

If Q (a+h,f(a+h)) is another point on the graph, then the slope of the line 
passing through P and Q is 


\[ \frac{f(a+h) - f(a)}{(a+h) - a} = \frac{f(a+h) - f(a)}{h}. \]


As $h$ approaches 0, the point $Q$ moves on the curve $y = f(x)$, getting closer and
closer to $P$. At the same time, the line $PQ$ will oscillate about $P$, approaching (under
certain circumstances) a limit position which represents the tangent line to the graph
at the point $P$. Precisely, if

\[ m = \lim_{h\to 0} \frac{f(a+h) - f(a)}{h} \]

exists in $\mathbb{R}$, the tangent line to the graph of $f$ at the point $P$ is the line having the
equation $y = f(a) + m(x - a)$. Geometrically, $m = \tan\theta$, where $\theta \in (-\pi/2, \pi/2)$
is the angle made by the tangent line with the $x$-axis.
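The passage from secant slopes to the slope of the tangent can be watched numerically. The following Python sketch is an illustration only; the sample function $f(x) = x^2$, the point $a = 1$, and the helper name `secant_slope` are assumptions made for the example.

```python
def secant_slope(f, a, h):
    """Slope of the line through P(a, f(a)) and Q(a + h, f(a + h))."""
    return (f(a + h) - f(a)) / h

def square(x):
    return x * x

a = 1.0
for k in range(1, 7):
    h = 10.0 ** (-k)
    # For f(x) = x^2 the secant slope equals 2a + h, so it tends to 2a = 2.
    print(h, secant_slope(square, a, h))

# Tangent line y = f(a) + m (x - a) with m approximated by a small step.
m = secant_slope(square, a, 1e-6)
tangent = lambda x: square(a) + m * (x - a)
print(tangent(1.1))
```

In floating-point arithmetic the step $h$ should not be taken too small, since the subtraction $f(a+h) - f(a)$ then loses accuracy.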

The problem of instantaneous velocity (or, more generally, that of the instantaneous rate
of change of a function) is also related to differential calculus. Let us consider the case of an
object moving along a line. The distance traveled by the object in the time interval
from 0 to $t$ is a function of $t$, say $S(t)$.

Fig. 8.1 The tangent and a secant line


Fig. 8.2. The space variation 


Let $t_0$ be a specific moment of time. The average speed in the time interval from
$t_0$ to $t_0 + h$ is the quotient of the distance traveled and the traveling time, which is

\[ \frac{S(t_0 + h) - S(t_0)}{h}. \]

See Fig. 8.2. The speed at the time $t_0$ will be defined as

\[ v(t_0) = \lim_{h\to 0} \frac{S(t_0 + h) - S(t_0)}{h} \]

if the limit exists. In a similar way, we can consider the instantaneous rate of change of the
velocity, which allows us to define the acceleration at the moment $t_0$ by the formula

\[ a(t_0) = \lim_{h\to 0} \frac{v(t_0 + h) - v(t_0)}{h}. \]


8.1 Differentiable Functions 


In what follows, we will consider real-valued functions defined on a nondegenerate
interval $I$, or more generally, on subsets of $\mathbb{R}$ all of whose points are
accumulation points.


8.1.1 Definition We say that $f$ is differentiable at the point $a$ if the limit

\[ \lim_{h\to 0} \frac{f(a+h) - f(a)}{h} \]

exists and is finite. The value of this limit is called the derivative of $f$ at $a$ and is
denoted by


\[ f'(a) \quad\text{or}\quad \frac{df}{dx}(a) \quad\text{or}\quad \dot{f}(a). \]


Clearly,

\[ f'(a) = \lim_{x\to a} \frac{f(x) - f(a)}{x - a}. \]

If the above limit exists and is infinite, then the function is not differentiable, but
we say that $f$ has a derivative at $a$.

We call a function $f$ differentiable if it is differentiable at every point of the domain.
In this case, the function which associates to each $x$ the value of the derivative at $x$
is called the derivative of $f$ and denoted by $f'$.

Since

\[ f(x) = f(a) + \frac{f(x) - f(a)}{x - a}\cdot(x - a) \quad\text{for all } x, a \in I,\ x \ne a, \]

the differentiability at a point implies the continuity at that point.
It is easy to see, starting from the definition, that most of the usual functions are 
differentiable on R. Here are some examples: 


$(C)' = 0$ (the derivative of a constant function is 0)
$(x^n)' = n x^{n-1}$ for every $n \in \mathbb{N}$
$(\sin x)' = \cos x$
$(\cos x)' = -\sin x$
$(e^x)' = e^x$
$(a^x)' = a^x \log a$.
Other functions, though continuous, may raise problems. For example, the function
$\sqrt{x}$, although continuous on the interval $[0, \infty)$, is differentiable only for $x > 0$,
in which case we get

\[ (\sqrt{x})' = \frac{1}{2\sqrt{x}}. \]

At $x = 0$, this function has derivative equal to $\infty$. Indeed,

\[ \lim_{x\to 0+} \frac{\sqrt{x} - \sqrt{0}}{x - 0} = \lim_{x\to 0+} \frac{1}{\sqrt{x}} = \infty. \]
More examples follow from the algebraic operations with differentiable functions:

\[ (f + g)' = f' + g' \]
\[ (\alpha f)' = \alpha f' \]
\[ (fg)' = f'g + fg' \]
\[ \left(\frac{f}{g}\right)' = \frac{f'g - fg'}{g^2} \quad\text{(at all points $a$ where $g(a) \ne 0$).} \]

These operations work pointwise. For example, if $f$ and $g$ are two functions differentiable
at a point $a$, then $f + g$ is also differentiable at $a$ and $(f + g)'(a) = f'(a) + g'(a)$.


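8.1.2 Theorem A function $f\colon I \to \mathbb{R}$ is differentiable at $a$ if and only if there is a
number $\lambda \in \mathbb{R}$ and a function $\omega\colon I \to \mathbb{R}$ such that $\lim_{x\to a} \omega(x) = \omega(a) = 0$ and

\[ f(x) = f(a) + \lambda(x - a) + \omega(x)\cdot|x - a| \quad\text{for all } x \in I. \tag{8.1} \]

In this case, $\lambda = f'(a)$.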


Proof Let us consider first that $f$ is differentiable at $a$. Then, it is easy to see that the
representation formula (8.1) holds true for $\lambda = f'(a)$ and

\[ \omega(x) = \begin{cases} \dfrac{f(x) - f(a) - f'(a)(x - a)}{|x - a|} & \text{if } x \in I,\ x \ne a \\[2mm] 0 & \text{if } x = a. \end{cases} \]

Moreover,

\[ \lim_{x\to a} \omega(x) = \lim_{x\to a} \left( \frac{f(x) - f(a)}{x - a} - f'(a) \right)\cdot\frac{x - a}{|x - a|} = 0. \]

For the converse, let us remark that the representation formula for $f(x)$ can be
restated as

\[ \frac{f(x) - f(a)}{x - a} = \lambda + \omega(x)\cdot\frac{|x - a|}{x - a}. \]

Since the right-hand side has a finite limit as $x$ approaches $a$, it follows that the function
$f$ is differentiable at $a$ and $f'(a) = \lambda$.


By (8.1) and the characterization of continuity using limits, we recover the fact
that differentiability at a point implies the continuity at that point.
Examples of nowhere differentiable continuous functions are presented in
Sect. 8.11.

To each function $f\colon I \to \mathbb{R}$ differentiable at a point $a$, we can associate a linear
function

\[ df(a)\colon \mathbb{R} \to \mathbb{R}, \quad df(a)(x) = f'(a)\,x, \]

called the differential of $f$ at $a$. The conclusion of Theorem 8.1.2 can thus be written
as

\[ f(x) = f(a) + df(a)(x - a) + o(|x - a|) \quad\text{as } x \to a. \]

Here $y = f(a) + df(a)(x - a)$ represents the linear part of $f$ and coincides with
the tangent line to the graph of $f$ at $a$.

The composition of differentiable functions (known as the chain rule) follows 
from Theorem 8.1.2. 


8.1.3 Theorem (Chain rule) If $f\colon I \to \mathbb{R}$ is differentiable at $a$ and $g\colon J \to \mathbb{R}$
(with $J \supset f(I)$) is differentiable at $b = f(a)$, then $g \circ f$ is differentiable at $a$ and

\[ (g \circ f)'(a) = g'(f(a))\cdot f'(a). \]

Proof According to Theorem 8.1.2, the two functions can be represented as

\[ f(x) = f(a) + f'(a)(x - a) + \omega(x)\,|x - a| \]

and

\[ g(y) = g(b) + g'(b)(y - b) + \Omega(y)\,|y - b|, \]

where $\lim_{x\to a} \omega(x) = \omega(a) = 0$ and $\lim_{y\to b} \Omega(y) = \Omega(b) = 0$. Then

\[ g(f(x)) = g(f(a)) + g'(f(a))\,(f(x) - f(a)) + \Omega(f(x))\,|f(x) - f(a)| \]
\[ \phantom{g(f(x))} = g(f(a)) + g'(f(a))\,f'(a)\,(x - a) + \tilde{\omega}(x)\,|x - a|, \]

where

\[ \tilde{\omega}(x) = g'(f(a))\,\omega(x) + \Omega(f(x))\,\frac{|f(x) - f(a)|}{|x - a|} \]

for $x \ne a$ and $\tilde{\omega}(a) = 0$. Since $\lim_{x\to a} \tilde{\omega}(x) = 0$, we conclude from Theorem 8.1.2
that $g \circ f$ is differentiable at $a$ and

\[ (g \circ f)'(a) = g'(f(a))\cdot f'(a). \]


8.1.4 Theorem (The differentiability of the inverse function) Let $f\colon I \to \mathbb{R}$ be an
injective continuous function. If $f$ is differentiable at $a$, then the inverse function
$f^{-1}\colon f(I) \to I$ is differentiable at $f(a)$ if and only if $f'(a) \ne 0$. Moreover,

\[ (f^{-1})'(f(a)) = \frac{1}{f'(a)}. \]
Fig. 8.3 Geometrically, Theorem 8.1.4 expresses the symmetry of the tangent lines
to the graphs of the functions $f$ and $f^{-1}$ with respect to the line $y = x$


The geometric meaning of Theorem 8.1.4 is shown in Fig. 8.3. 


Proof Suppose first that the function $f^{-1}$ is differentiable at $f(a)$. Taking into
account the formula $f^{-1} \circ f = \mathrm{id}_I$ and differentiating it at $a$, we infer that

\[ (f^{-1})'(f(a))\cdot f'(a) = 1, \]

whence we obtain both the fact that $f'(a) \ne 0$ and the formula in the statement.

Suppose now that $f'(a) \ne 0$. By Theorem 6.4.6, $f$ induces a homeomorphism of
$I$ onto $f(I)$. Therefore, if $y_n \to f(a)$ in $f(I)$ and $y_n \ne f(a)$ for all indices $n$, then
$x_n = f^{-1}(y_n) \to a$ in $I$ and $x_n \ne a$ for all $n$. Since

\[ \frac{f^{-1}(y_n) - f^{-1}(f(a))}{y_n - f(a)} = \frac{x_n - a}{f(x_n) - f(a)} \to \frac{1}{f'(a)} \]

as $n \to \infty$, by Heine's Characterization of Limits, we conclude that

\[ (f^{-1})'(f(a)) = \lim_{y\to f(a)} \frac{f^{-1}(y) - f^{-1}(f(a))}{y - f(a)} \]

exists and equals $1/f'(a)$.
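The formula $(f^{-1})'(f(a)) = 1/f'(a)$ can be tested on a function whose inverse has no convenient closed form. The Python sketch below is an illustration under assumptions of our own: the sample function $f(x) = x^3 + x$, the bisection bracket $[-10, 10]$, and the step size are all arbitrary choices.

```python
def f(x):
    """Strictly increasing sample function with f'(x) = 3x^2 + 1 > 0."""
    return x ** 3 + x

def f_prime(x):
    return 3 * x ** 2 + 1

def f_inverse(y, lo=-10.0, hi=10.0, iterations=200):
    """Invert f by bisection; assumes f(lo) < y < f(hi)."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def inverse_slope(a, h=1e-6):
    """Difference quotient of f^{-1} around the point f(a)."""
    y = f(a)
    return (f_inverse(y + h) - f_inverse(y - h)) / (2 * h)

a = 1.0
print(inverse_slope(a), 1 / f_prime(a))  # both close to 0.25
```

The quotient is taken over an interval that straddles $f(a)$; this is a design choice of the sketch, not something the theorem requires.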
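Theorem 8.1.4 yields new important examples of differentiable functions:

\[ (\log x)' = \frac{1}{x} \quad\text{for } x > 0 \]
\[ (\arcsin x)' = \frac{1}{\sqrt{1 - x^2}} \quad\text{for } x \in (-1, 1) \]
\[ (\arccos x)' = \frac{-1}{\sqrt{1 - x^2}} \quad\text{for } x \in (-1, 1) \]
\[ (\arctan x)' = \frac{1}{1 + x^2} \quad\text{for } x \in \mathbb{R} \]
\[ (\operatorname{arccot} x)' = \frac{-1}{1 + x^2} \quad\text{for } x \in \mathbb{R}. \]

This allows us to settle the differentiability of the power function. Indeed,
$x^\alpha = e^{\alpha \log x}$ and thus

\[ (x^\alpha)' = \alpha x^{\alpha - 1}, \quad\text{for } x \in (0, \infty) \text{ and } \alpha \in \mathbb{R}. \]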


The left and the right derivative at a point $a$ are, respectively, defined by

\[ f'_l(a) = \lim_{x\to a-} \frac{f(x) - f(a)}{x - a} \quad\text{and}\quad f'_r(a) = \lim_{x\to a+} \frac{f(x) - f(a)}{x - a}. \]

The differentiability at an interior point is equivalent to the fact that both the left
and right derivatives exist, are finite and equal. When the function is continuous
but not differentiable at a point $a$, the existence of the left derivative at $a$ leads to
the existence of a left tangent line to the graph. This has the equation $y - f(a) =
f'_l(a)(x - a)$, $x \le a$, if $f'_l(a)$ is finite, and $x = a$, if $f'_l(a)$ is not finite; in the second
case the tangent line is a vertical line. A similar discussion refers to the case of right
derivatives. In analytic geometry, a point of continuity where the left and the right
derivatives exist, are different and at least one is finite is called an angular point. A
point of continuity where one of the one-sided derivatives is $\infty$, while the other one
is $-\infty$, is called a vertical cusp.

The function $f(x) = |x|$ is not differentiable at 0 but has finite one-sided derivatives,

\[ f'_l(0) = -1 \quad\text{and}\quad f'_r(0) = 1, \]

so that the origin is an angular point. Other examples are shown in Fig. 8.4.
The derivatives of order $n$ for $n \ge 2$ are defined recursively by the formula

\[ \frac{d^n f}{dx^n} = \frac{d}{dx}\left(\frac{d^{n-1} f}{dx^{n-1}}\right). \]

Note that to define $\frac{d^n f}{dx^n}$ at a point $a$, we need the existence of the derivative of order
$(n - 1)$ on a neighborhood of $a$.
For completeness, we define the derivative of order 0 as being the function itself:

\[ \frac{d^0 f}{dx^0} = f. \]
Fig. 8.4 a The points $\pm 3$ are angular points for $f(x) = |x^2 - 9|$, b The origin is a cusp for $f(x) = \sqrt{|x|}$


The derivative of order $n$ is also denoted by $f^{(n)}$. However, it is customary to
denote lower order derivatives using Roman numerals:


\[ f', \ f'', \ f''', \ f^{IV}, \ \dots \]


The following formulas hold for higher order derivatives: 


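\[ (f + g)^{(n)} = f^{(n)} + g^{(n)}; \]
\[ (\alpha f)^{(n)} = \alpha f^{(n)}; \]
\[ (fg)^{(n)} = \sum_{k=0}^{n} \binom{n}{k} f^{(k)} g^{(n-k)} \quad\text{(Leibniz's Formula).} \]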


Leibniz’s formula can be easily proved by using the Principle of Mathematical 
Induction. 
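Leibniz's formula can also be checked mechanically on polynomials, where derivatives of every order are computed exactly. The Python sketch below is an illustration only: polynomials are stored as coefficient lists (constant term first), and all helper names are our own.

```python
from math import comb

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_add(p, q):
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
            for i in range(n)]

def poly_diff(p, order=1):
    """Derivative of the given order; the zero polynomial is [0]."""
    for _ in range(order):
        p = [k * c for k, c in enumerate(p)][1:] or [0]
    return p

def trim(p):
    """Drop trailing zero coefficients so that equal polynomials compare equal."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def leibniz(f, g, n):
    """Right-hand side of Leibniz's formula: sum of C(n, k) f^(k) g^(n-k)."""
    total = [0]
    for k in range(n + 1):
        term = poly_mul(poly_diff(f, k), poly_diff(g, n - k))
        total = poly_add(total, [comb(n, k) * c for c in term])
    return trim(total)

f = [1, 2, 0, 1]   # 1 + 2x + x^3
g = [0, 1, 3]      # x + 3x^2
direct = trim(poly_diff(poly_mul(f, g), 3))
print(direct, leibniz(f, g, 3))
```

Integer coefficients keep the comparison exact, so no tolerance is needed.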

Faà di Bruno's Formula for the $n$th derivative of a composition of functions
is indicated in the Notes and Remarks at the end of this chapter.

The polynomial functions, the exponential, the logarithm, as well as the sine and 
cosine functions, are all infinitely differentiable, that is, they have derivatives of every 
order. The functions having continuous derivative of order $n$ are called functions of
class $C^n$, while the functions having derivatives of every order are called functions
of class $C^\infty$.

The function $f(x) = |x|^{2n+1}$ is of class $C^{2n}$ but doesn't have a derivative of order
$(2n + 1)$ at 0.

For $n \in \mathbb{N} \cup \{\infty\}$ and a nondegenerate interval $I$, we denote by $C^n(I, \mathbb{R})$ the
set of all real-valued functions of class $C^n$, defined on $I$. The three formulas listed
above show that $C^n(I, \mathbb{R})$ is a commutative unital algebra (with respect to the usual
operations).

If $I$ and $J$ are nondegenerate intervals, a function $f\colon I \to J$ is called a diffeomorphism
(respectively, a diffeomorphism of class $C^n$ or a $C^n$ diffeomorphism) if $f$ is
bijective and both $f$ and $f^{-1}$ are differentiable (respectively of class $C^n$). Applying
Theorem 8.1.4, we deduce the following criterion:


8.1.5 Theorem Let $f\colon I \to J$ be a differentiable homeomorphism between two
nondegenerate intervals. If $f'(x) \ne 0$ on $I$ and $f$ is of class $C^n$, then $f$ is also a
diffeomorphism of class $C^n$.


Proof By Mathematical Induction. The case when $n = 1$ is a direct consequence of
Theorem 8.1.4.

Suppose now that the statement is valid for all differentiable homeomorphisms
of class $C^{n-1}$ ($n \ge 2$) having nonzero derivative at all points and let $f\colon I \to J$ be a
differentiable homeomorphism of class $C^n$ such that $f'(x) \ne 0$ on $I$.

According to Theorem 8.1.4, the function $g = f^{-1}$ is differentiable and

\[ g'(y) = \frac{1}{f'(g(y))} \quad\text{for all } y \in J. \]

Therefore, $g'$ is the composition of three functions: $g\colon J \to I$, $f'\colon I \to \mathbb{R}\setminus\{0\}$, and
$t \mapsto \frac{1}{t}$, from $\mathbb{R}\setminus\{0\}$ into itself.

By our hypothesis of induction, $g$ is of class $C^{n-1}$. Since $f'$ is also of class $C^{n-1}$
and the function $t \mapsto \frac{1}{t}$ is of class $C^\infty$ on $\mathbb{R}\setminus\{0\}$, we conclude that $g'$ is of class
$C^{n-1}$, that is, $g$ is of class $C^n$.

We will show in the next section that every derivative has the intermediate value
property. Therefore, if $f'(x) \ne 0$ on an interval $I$, then either $f'(x) > 0$ or $f'(x) < 0$
on the whole interval.


8.1.6 Remark It is useful to extend the concept of differentiable function to the case
of complex-valued functions (or to vector-valued functions) using Definition 8.1.1.
Since the complex limits are computed coordinatewise, the differentiability of a
function $f\colon I \to \mathbb{C}$ means exactly the differentiability of $\operatorname{Re} f$ and $\operatorname{Im} f$. In this case
we have:

\[ f' = (\operatorname{Re} f)' + i\,(\operatorname{Im} f)'. \]

In particular,

\[ (e^{i\alpha x})' = (\cos \alpha x + i \sin \alpha x)' = -\alpha \sin \alpha x + i\alpha \cos \alpha x = i\alpha e^{i\alpha x}. \]


Exercises 


1. Let f(x) = x*|x—a|—|x—b| bea function which depends on two real parameters 
a and b. Under what conditions is this function differentiable on R? 

2. (a) Find f¢™ for f(x) = aera 
(b) Find $g^{(n)}(0)$ for $g(x) = \arctan x$.
[Hint: In the case (b), notice that $(1 + x^2)\,g'(x) = 1$.]

3. Suppose that the one-sided derivatives $f'_l(a)$ and $f'_r(a)$ exist and are finite.
Show that $f$ is continuous at $a$.

4. Suppose that $f\colon (a, b) \to \mathbb{R}$ is an injective function of class $C^1$ with $f'(x) \ne 0$
for every $x \in (a, b)$. Prove that $f((a, b))$ is an open interval.

5. (Multiple roots). Consider a polynomial function $P(x)$ and a real number $\alpha$.
Prove that $\alpha$ is a root of multiplicity $k \ge 1$ if and only if $P(\alpha) = P'(\alpha) = \cdots =
P^{(k-1)}(\alpha) = 0$ and $P^{(k)}(\alpha) \ne 0$.

6. A function $f\colon I \to \mathbb{R}$ is said to be strongly differentiable at a point $a$ if there
is a number $\lambda$ such that for each $\varepsilon > 0$, one can find a number $\delta > 0$ for which
$x_1, x_2 \in I$, $|x_1 - a| < \delta$, $|x_2 - a| < \delta$ implies

\[ |f(x_1) - f(x_2) - \lambda(x_1 - x_2)| \le \varepsilon\,|x_1 - x_2|. \]


Prove that: 

(a) The function f defined by f(0) = 0 and f(x) = 5x nO a sin(1/x) for x 4 0, 
is differentiable at the origin but not strongly differentiable. 

(b) The function g defined by g(0) = 0, g(1/n) = 1/n? for n € N*, and 
extended by linearity to all intervals [1/(n + 1), 1/n], is strongly differentiable 
at the origin, but not differentiable in any neighborhood of the origin. 

(c) All functions of class C! are strongly differentiable. 

(d) The strongly differentiable functions are locally Lipschitz. 


8.2 The Monotone Function Theorem 


Many applications of analysis concern the connection between the monotonicity of 
a function and the sign of its derivative. In order to establish this relation, we need 
the following technical result: 


8.2.1 Lemma Let $f\colon [a, b] \to \mathbb{R}$ be a function which is differentiable at a point $c$
of $(a, b)$ and let $(a_n)_n$ and $(b_n)_n$ be two sequences in $[a, b]$ converging to $c$ such
that $a_n < c < b_n$ for all $n$. Then

\[ \lim_{n\to\infty} \frac{f(b_n) - f(a_n)}{b_n - a_n} = f'(c). \]


Proof Notice that

\[ \frac{f(b_n) - f(a_n)}{b_n - a_n} = \frac{b_n - c}{b_n - a_n}\cdot\frac{f(b_n) - f(c)}{b_n - c} + \frac{c - a_n}{b_n - a_n}\cdot\frac{f(c) - f(a_n)}{c - a_n}, \]

where $(b_n - c)/(b_n - a_n)$ and $(c - a_n)/(b_n - a_n)$ are positive numbers that sum to
1. This implies that $(f(b_n) - f(a_n))/(b_n - a_n)$ lies between the minimum and the
maximum of the numbers

\[ \frac{f(b_n) - f(c)}{b_n - c} \quad\text{and}\quad \frac{f(c) - f(a_n)}{c - a_n}. \]

Since both fractions converge to $f'(c)$, the conclusion of the lemma follows from
the Squeeze Theorem.
The idea of the next proof is due to Pompeiu. See also Exercise 11, Sect. 8.3. 


8.2.2 Theorem (The first-derivative test for monotonicity) If $f' \ge 0$ on an interval,
then $f$ is increasing on that interval.


Proof Suppose that $f' \ge 0$ on an interval $I$ but there are $a_1 < b_1$ in $I$ such that
$f(a_1) > f(b_1)$. Then

\[ m = \frac{f(b_1) - f(a_1)}{b_1 - a_1} < 0. \]

Moreover, for every point $x \in (a_1, b_1)$, one of the following fractions

\[ \frac{f(x) - f(a_1)}{x - a_1} \quad\text{and}\quad \frac{f(b_1) - f(x)}{b_1 - x} \]

is at least $m$, while the other one is at most $m$ (notice that $m$ is a convex combination of
these fractions). Dividing the interval $[a_1, b_1]$ in halves and using the remark above,
we get two sequences $(a_n)_n$ and $(b_n)_n$ such that

\[ a_n \le a_{n+1} < b_{n+1} \le b_n, \]
\[ \frac{f(b_n) - f(a_n)}{b_n - a_n} \le m, \]
\[ \lim_{n\to\infty} (b_n - a_n) = 0. \]

Let $c$ be the common limit of the two above sequences. By Lemma 8.2.1, we
infer that $f'(c) \le m < 0$, which is a contradiction. Consequently, $f$ is increasing
on $I$.
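The halving procedure used in this proof is effective: starting from a chord of negative slope $m$, it locates a point where the derivative is at most $m$. The Python sketch below illustrates this under assumptions of our own, namely the sample function $f(x) = x^3 - x$ on $[-1, 0.5]$ and a fixed number of 30 halvings.

```python
def f(x):
    return x ** 3 - x

def f_prime(x):
    return 3 * x ** 2 - 1

def chord_slope(a, b):
    return (f(b) - f(a)) / (b - a)

def pompeiu_bisection(a, b, steps=30):
    """Halve [a, b] repeatedly, always keeping the half with the smaller slope.

    The slope over [a, b] is the average of the slopes over the two halves,
    so the smaller of the two never exceeds the slope of the current chord.
    """
    m = chord_slope(a, b)
    for _ in range(steps):
        mid = (a + b) / 2
        if chord_slope(a, mid) <= chord_slope(mid, b):
            b = mid
        else:
            a = mid
    return m, (a + b) / 2

m, c = pompeiu_bisection(-1.0, 0.5)
print(m, c, f_prime(c))  # f'(c) does not exceed m
```

Here $f(-1) = 0 > f(0.5) = -0.375$, so the initial chord has negative slope and the procedure ends near a point where $f' < 0$, as the proof predicts.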


8.2.3 Corollary (a) If $f' \le 0$ on an interval, then $f$ is decreasing on that interval.
(b) If $f' = 0$ on an interval, then $f$ is constant on that interval.
(c) If $f' > 0$ on an interval, then $f$ is strictly increasing on that interval.


The results in the last corollary can be made even more specific: If a function $f$ is
differentiable on an interval, then $f$ is increasing if and only if $f' \ge 0$ on the interval.
The function $f$ is strictly increasing if $f' \ge 0$ and the set where $f' = 0$ does not
include any nondegenerate interval.

D. Pompeiu constructed in 1907 an example of a strictly increasing differentiable 
function whose derivative is equal to 0 on a dense set. This example will be presented 
in the Notes and Remarks at the end of this chapter. 

Applications of Corollary 8.2.3(b) to second order linear differential equations 
with constant coefficients are given in Exercises 7 and 8. 
Corollary 8.2.3 also offers a very convenient way to prove many inequalities. We
mention here the car race principle: if a car has the highest speed, then the car will 
be in front at any given moment. In mathematical terms, this principle is stated as 
follows: 


8.2.4 Theorem Let $f$ and $g$ be two functions which are continuous on an interval
$I$ and differentiable on the interior of $I$. If $f'(x) \le g'(x)$ at every interior point of $I$,
then

\[ f(x) - f(a) \le g(x) - g(a) \quad\text{for all } x, a \in I,\ x \ge a. \]

Proof By Corollary 8.2.3, the inequality holds on any closed interval included in the
interior of $I$. Through a limit process, we obtain that the inequality also holds at the
endpoints of the interval (the endpoints that belong to $I$).


8.2.5 Corollary If $f$ is continuous on $[a, b]$, differentiable on $(a, b)$ and $m \le
f'(x) \le M$ on $(a, b)$, then

\[ m(x - a) \le f(x) - f(a) \le M(x - a) \quad\text{for all } x \in [a, b]. \]


Theorem 8.2.4 and Corollary 8.2.5 are an important source of inequalities. Some 
of them are mentioned in Exercises 1-4. 
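Exercises

1. Prove Bernoulli's inequality: For $x > 0$,

\[ x^\alpha \ge 1 + \alpha(x - 1) \quad\text{if } \alpha \in (-\infty, 0] \cup [1, \infty) \]
\[ x^\alpha \le 1 + \alpha(x - 1) \quad\text{if } \alpha \in [0, 1]. \]

2. Use Bernoulli's inequality to prove Young's inequality: For every $x, y > 0$,

\[ x^\alpha y^{1-\alpha} \ge \alpha x + (1 - \alpha) y \quad\text{if } \alpha \in (-\infty, 0] \cup [1, \infty) \]
\[ x^\alpha y^{1-\alpha} \le \alpha x + (1 - \alpha) y \quad\text{if } \alpha \in [0, 1]. \]

3. Infer from Young's inequality the Rogers-Hölder inequality: If $p, q > 1$ and
$1/p + 1/q = 1$, then

\[ \left| \sum_{k=1}^{n} a_k b_k \right| \le \left( \sum_{k=1}^{n} |a_k|^p \right)^{1/p} \left( \sum_{k=1}^{n} |b_k|^q \right)^{1/q} \]

for all $a_1, \dots, a_n, b_1, \dots, b_n \in \mathbb{R}$, and all integers $n \ge 1$.

4. (Jordan's inequality). Prove that $2x/\pi < \sin x < x$, for all $x \in (0, \pi/2)$.

5. (I. Schur). Prove that if $\alpha \in \mathbb{R}$, then the sequence $a_\alpha(n) = \left(1 + \frac{1}{n}\right)^{n+\alpha}$ is
decreasing if $\alpha \in [\frac12, \infty)$, and increasing for $n \ge N(\alpha)$ if $\alpha \in (-\infty, 1/2)$.
[Hint: Consider the function $f(x) = \left(1 + \frac{1}{x}\right)^{x+\alpha}$ for $x \ge 1$.]

6. Solve the linear differential equation

\[ u' + 5u = 0, \]

that is, find all functions $u \in C^1(\mathbb{R})$ such that $u'(x) + 5u(x) = 0$, for all $x \in \mathbb{R}$.
[Hint: Multiply the equation by $e^{5x}$.]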

7. (Second order linear differential equations with constant coefficients). Consider
the differential equation with real constant coefficients

\[ u'' + bu' + cu = 0. \]

A solution of this equation is a function $u \in C^2(\mathbb{R})$ such that $u''(x) + bu'(x) +
cu(x) = 0$, for all $x \in \mathbb{R}$. The aim of this exercise is to find the general form
of the solutions and to conclude that they form a two dimensional linear space
(consisting of infinitely differentiable functions).

(a) Verify that any solution $\lambda$ of the quadratic equation $\lambda^2 + b\lambda + c = 0$ generates a
solution $u = e^{\lambda x}$ of the above equation. Convert them into a pair of real-valued
independent solutions if $\lambda \in \mathbb{C}\setminus\mathbb{R}$.

(b) Prove that there exists a real number $\mu$ such that the function $v = ue^{-\mu x}$
verifies a simplified differential equation of the form

\[ v'' - \delta v = 0, \]

where $\delta = \frac{b^2 - 4c}{4}$.

(c) If $\delta \ne 0$, put

\[ \omega = \begin{cases} \sqrt{\delta} & \text{if } \delta > 0 \\ i\sqrt{-\delta} & \text{if } \delta < 0 \end{cases} \]

and prove that the function $w = v' + \omega v$ verifies the condition $w' - \omega w = 0$.
This fact implies that $w = Ce^{\omega x}$ for an arbitrary constant $C$.
(d) Prove that the condition $v' + \omega v = Ce^{\omega x}$ is equivalent to $(ve^{\omega x})' = Ce^{2\omega x}$
and conclude that all solutions of the equation $v'' - \omega^2 v = 0$ are of the form
$v = C_1 e^{-\omega x} + C_2 e^{\omega x}$, where $C_1$ and $C_2$ are arbitrary constants.
(e) For $\delta = 0$, the general form of $v$ is $v = C_1 x + C_2$, where $C_1$ and $C_2$ are
arbitrary constants.
Remark. If $\delta < 0$, the equation $u'' + bu' + cu = 0$ admits a real algebraic basis
of solutions consisting of two functions, $e^{\mu x} \cos \sqrt{-\delta}\,x$ and $e^{\mu x} \sin \sqrt{-\delta}\,x$.

8. Find all values of the real parameter $\lambda$ for which the differential equation
$u'' + \lambda u = 0$ admits a nonzero solution that verifies the conditions $u(0) =
u(1) = 0$.
8.3 The Basic Theorems of Differential Calculus


One of the main applications of the differential calculus is to provide easy to check 
necessary and/or sufficient conditions for the extrema of functions. 

In this section, we will consider only functions defined on nondegenerate intervals. 

We say that a function $f\colon I \to \mathbb{R}$ has a local maximum at a point $p$ if there is a
neighborhood $V$ of $p$, such that $f(x) \le f(p)$, for all $x \in V \cap I$. We say that $f$ has a
local minimum at a point $p$ if there is a neighborhood $V$ of $p$, such that $f(x) \ge f(p)$
for all $x \in V \cap I$. The local maximum and local minimum points are called local
extrema.

If the inequality holds for all $x$ in the domain of the function, then the point will
be called a global maximum (global minimum, respectively).

We say that a function $f$ has a local strict maximum at a point $p$ if there is $V$,
a neighborhood of $p$, such that $f(x) < f(p)$, for all $x \in V$, $x \ne p$. We say that a
function $f$ has a local strict minimum at a point $p$ if there is $V$, a neighborhood of
$p$, such that $f(x) > f(p)$, for all $x \in V$, $x \ne p$. The local strict maximum and local
strict minimum points are called local strict extrema.


8.3.1 Fermat's Theorem Let $f\colon I \to \mathbb{R}$ be a function which has a local extremum
at an interior point $a$ of $I$. If $f$ is differentiable at $a$, then $f'(a) = 0$.


Proof Without loss of generality, we may assume that $a$ is a maximum point. Then

\[ \frac{f(x) - f(a)}{x - a} \ge 0 \ \text{ for } x < a \quad\text{and}\quad \frac{f(x) - f(a)}{x - a} \le 0 \ \text{ for } x > a. \]

Since $f$ is differentiable at $a$, it follows that $f'_l(a) \ge 0$ and $f'_r(a) \le 0$, hence $f'(a) = 0$.


The points where the derivative equals 0 are called critical points. 

Fermat's Theorem asserts that every interior point of local extremum for a differentiable
function is necessarily a critical point. The converse is not true. For example,
0 is a critical point for the function $f(x) = x^3$, which is strictly increasing on $\mathbb{R}$.
Thus, this point cannot be a point of local extremum.

The case of the function f(x) = x (for x € [0, 1]) shows that the hypothesis that 
a is an interior point is essential for Theorem 8.3.1. 

Fermat’s Theorem has many important consequences that will be presented below. 


8.3.2 Theorem (G. Darboux) If a real-valued function is differentiable on an interval,
then its derivative has the intermediate value property on that interval.


Proof It suffices to show that if $f\colon [a, b] \to \mathbb{R}$ is a differentiable function such that
$f'(a) < 0$ and $f'(b) > 0$, then there exists a point $c \in (a, b)$ such that $f'(c) = 0$.
Indeed, by taking into account that the derivative at a point is a limit, we easily
infer that $f$ has strict local maxima at $a$ and $b$, so that $f$ attains its minimum at an
interior point $c \in (a, b)$. The existence of the minimum follows from the Weierstrass
Theorem. According to Fermat's Theorem, $f'(c) = 0$.


See Exercise 6 for an alternative proof of Theorem 8.3.2. 
8.3.3 Rolle's Theorem Let $f$ be a real-valued function that is continuous on $[a, b]$,
differentiable on $(a, b)$ and such that $f(a) = f(b)$. Then, there is a point $c \in (a, b)$ such that
$f'(c) = 0$.


Proof If the function is constant, then every point works. If the function is not
constant, then the maximum and the minimum of the function are attained at different
points, say $x_m$ and $x_M$; the existence of the minimum and maximum follows from the
Weierstrass Theorem. Since $f(a) = f(b)$, only one of the points $x_m$ and $x_M$ can be
an endpoint of the interval. Therefore, at least one of them is an interior point and
so, by Fermat's Theorem, at that point the derivative equals 0.


See Exercise 11 for an alternative proof that avoids the use of the Weierstrass Theorem.

Rolle’s theorem implies that between two consecutive zeros of a differentiable 
function, there is at least one zero of the derivative. Therefore, between any two con- 
secutive zeros of the derivative, there is at most one zero of the function. This remark 
is the basis of a well known technique of separating the zeros of a differentiable 
function called Rolle’s sequence. 
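A small computation shows how such a sequence separates roots. The Python sketch below is an illustration under assumptions of our own: the sample polynomial $p(x) = x^3 - 3x + 1$, whose derivative vanishes at $\pm 1$, and the finite endpoints $\pm 10$ standing in for the limits at $\pm\infty$.

```python
def sign(value):
    """Return -1, 0 or 1 according to the sign of value."""
    return (value > 0) - (value < 0)

def rolle_sequence(f, critical_points, lo, hi):
    """Signs of f at lo, at the zeros of f' (in increasing order), and at hi.

    lo and hi are assumed to lie beyond all real roots of f.
    """
    nodes = [lo] + sorted(critical_points) + [hi]
    return [sign(f(x)) for x in nodes]

def count_sign_changes(signs):
    """Each sign change between neighbours isolates exactly one root."""
    return sum(1 for s, t in zip(signs, signs[1:]) if s * t < 0)

def p(x):
    return x ** 3 - 3 * x + 1

signs = rolle_sequence(p, [-1.0, 1.0], -10.0, 10.0)
print(signs, count_sign_changes(signs))
```

For this polynomial the signs alternate at every node, so each of the intervals $(-10, -1)$, $(-1, 1)$ and $(1, 10)$ contains exactly one root.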


8.3.4 Cauchy's Mean Value Theorem Let $f$ and $g$ be two real-valued functions
that are continuous on $[a, b]$ and differentiable on $(a, b)$, and suppose that $g'(x) \ne 0$
on $(a, b)$. Then:

(a) $g(a) \ne g(b)$;

(b) there is a point $c \in (a, b)$ such that

\[ \frac{f(b) - f(a)}{g(b) - g(a)} = \frac{f'(c)}{g'(c)}. \]


Proof The first conclusion follows from Rolle's Theorem. For the second, let

\[ \varphi(x) = f(x) - \frac{f(b) - f(a)}{g(b) - g(a)}\,(g(x) - g(a)) \quad\text{for } x \in [a, b]. \]

The function $\varphi$ satisfies the hypotheses of Rolle's Theorem and thus, there is a point
$c \in (a, b)$ such that $\varphi'(c) = 0$. Hence,

\[ f'(c) - \frac{f(b) - f(a)}{g(b) - g(a)}\,g'(c) = 0, \]

and the conclusion is now clear.
Cauchy’s Mean value Theorem has a lot of applications in analysis. A particular 
case is the following theorem due to Lagrange. 


8.3.5 Mean Value Theorem (also known as Lagrange's Mean Value Theorem) Let
$f\colon [a, b] \to \mathbb{R}$ be a function which is continuous on $[a, b]$ and differentiable on
$(a, b)$. Then, there is $c \in (a, b)$ such that

\[ f(b) - f(a) = f'(c)(b - a). \]

As the restriction of $e^{ix}$ to $[0, 2\pi]$ shows, the Mean Value Theorem and Rolle's
Theorem both fail for complex-valued functions.
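The counterexample is easy to verify by direct computation: the function takes the same value at both endpoints, yet its derivative never vanishes. The Python sketch below is an illustration only; the grid of 1001 sample points is an arbitrary choice.

```python
import cmath
import math

def f(x):
    """The complex-valued function x -> e^{ix}."""
    return cmath.exp(1j * x)

def f_prime(x):
    """Its derivative, i e^{ix}, whose modulus is identically 1."""
    return 1j * cmath.exp(1j * x)

# Equal values at the endpoints of [0, 2*pi] (up to rounding) ...
increment = f(2 * math.pi) - f(0.0)

# ... but the derivative stays on the unit circle, so it has no zero.
min_speed = min(abs(f_prime(2 * math.pi * k / 1000)) for k in range(1001))

print(abs(increment), min_speed)
```

Since $f(2\pi) - f(0) = 0$ while $|f'(c)|\cdot 2\pi = 2\pi$ for every $c$, no intermediate point can satisfy the conclusion of either theorem.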

It is worth noticing that the existence of the intermediate point $c$ in Theorems 8.3.4
and 8.3.5 is not accompanied by any constructive procedure for finding it. Nevertheless,
the Mean Value Theorem is the source of a useful estimate:

\[ |f(b) - f(a)| \le (b - a)\cdot\sup_{a<x<b} |f'(x)|. \]

This can be extended outside the class of differentiable functions. If $f\colon [a, b] \to \mathbb{R}$
is a Lipschitz function, then

\[ |f(x) - f(y)| \le \|f\|_{Lip}\,|x - y| \]

for all $x, y \in [a, b]$, where

\[ \|f\|_{Lip} = \sup_{u, v \in [a, b],\ u \ne v} \left| \frac{f(v) - f(u)}{v - u} \right| \]

represents the Lipschitz constant of $f$. When $f$ is differentiable and $f'$ is bounded,
then $f$ is Lipschitz and $\|f\|_{Lip} = \sup_{x\in[a,b]} |f'(x)|$.
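The equality between the Lipschitz constant and the supremum of $|f'|$ can be observed on a grid. The Python sketch below is an illustration under assumptions of our own: the sample function $\sin$ on $[0, \pi]$ and a uniform grid of 401 points; a finite grid can only give a lower estimate of the supremum over all chords.

```python
import math

def lipschitz_estimate(f, a, b, samples=400):
    """Largest chord slope |f(v) - f(u)| / (v - u) over a uniform grid."""
    xs = [a + (b - a) * k / samples for k in range(samples + 1)]
    best = 0.0
    for i, u in enumerate(xs):
        for v in xs[i + 1:]:
            best = max(best, abs(f(v) - f(u)) / (v - u))
    return best

estimate = lipschitz_estimate(math.sin, 0.0, math.pi)
derivative_bound = max(abs(math.cos(math.pi * k / 400)) for k in range(401))
print(estimate, derivative_bound)
```

The estimate stays below the bound $\sup |\cos x| = 1$ given by the Mean Value Theorem and approaches it as the grid is refined.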
From the Mean Value Theorem, it is easy to deduce the Monotone Function 
Theorem and all its consequences mentioned in the previous section. We want to 
point here one more application. 


8.3.6 Corollary Let $f$ be a function continuous on $[a, b]$, differentiable on $(a, b)$
and such that there is $\lambda = \lim_{x\to a+} f'(x)$. Then, $f$ has a right-hand derivative at $a$ and

\[ f'_r(a) = \lambda. \]

A similar result works for the left-hand derivative at $b$.


Proof We apply the Mean Value Theorem for $f|_{[a,x]}$ to infer that there is $c_x \in (a, x)$
such that

\[ \frac{f(x) - f(a)}{x - a} = f'(c_x). \]

Clearly, $c_x \to a$ as $x \to a$ and thus $\lim_{x\to a+} f'(c_x) = \lambda$. Therefore, the right-hand
derivative $f'_r(a)$ exists and equals $\lambda$.
Next, we will show how to use some of the above theorems to compute limits.


8.3.7 Bernoulli-L'Hôpital Rule for the Indeterminate Form $\frac{0}{0}$ Let $f$ and $g$ be
two differentiable functions on an interval $(a, b)$ (with $-\infty \le a < b \le \infty$) such
that $g'(x) \ne 0$ for all $x \in (a, b)$. If

\[ \lim_{x\to a} f(x) = \lim_{x\to a} g(x) = 0 \]

and $\lim_{x\to a} \frac{f'(x)}{g'(x)}$ exists in $\overline{\mathbb{R}}$, then $\lim_{x\to a} \frac{f(x)}{g(x)}$ also exists and

\[ \lim_{x\to a} \frac{f(x)}{g(x)} = \lim_{x\to a} \frac{f'(x)}{g'(x)}. \]

A similar statement holds at $b$.


Proof Since the derivatives have the intermediate value property (see Theorem 8.3.2),
it follows that $g'$ has constant sign on $(a, b)$. Without loss of generality, we may
assume that $g' > 0$. By Corollary 8.2.3(c), this implies that $g$ is strictly increasing
(and also positive) on $(a, b)$.

If $a \in \mathbb{R}$, then we can extend both functions $f$ and $g$ at $a$ by continuity and the
conclusion of the Bernoulli-L'Hôpital Rule can be derived from Cauchy's Mean Value
Theorem. Indeed, according to this theorem, for each $x \in (a, b)$, there is a point
$c_x \in (a, x)$ such that

\[ \frac{f(x)}{g(x)} = \frac{f(x) - f(a)}{g(x) - g(a)} = \frac{f'(c_x)}{g'(c_x)}. \]

As $x$ approaches $a$, so does $c_x$, and thus $\lim_{x\to a} \frac{f'(c_x)}{g'(c_x)} = \lim_{x\to a} \frac{f'(x)}{g'(x)}$. Consequently,
$\lim_{x\to a} \frac{f(x)}{g(x)} = \lim_{x\to a} \frac{f'(x)}{g'(x)}$.

If $a = -\infty$, then we will choose a number $c \in (-\infty, b)$ and observe that the
change of variable $x = c - 1/t$ establishes a diffeomorphism between the intervals
$(0, \infty)$ and $(-\infty, c)$. Since

\[ \ell = \lim_{x\to -\infty} \frac{f'(x)}{g'(x)} = \lim_{t\to 0+} \frac{f'(c - \frac{1}{t})}{g'(c - \frac{1}{t})}
   = \lim_{t\to 0+} \frac{f'(c - \frac{1}{t})\,t^{-2}}{g'(c - \frac{1}{t})\,t^{-2}}
   = \lim_{t\to 0+} \frac{\frac{d}{dt} f(c - \frac{1}{t})}{\frac{d}{dt} g(c - \frac{1}{t})}, \]

we infer from the previous case that

\[ \ell = \lim_{t\to 0+} \frac{f(c - \frac{1}{t})}{g(c - \frac{1}{t})} = \lim_{x\to -\infty} \frac{f(x)}{g(x)}. \]

8.3.8 Bernoulli-L'Hôpital Rule for the Indeterminate Form $\frac{\infty}{\infty}$ Let the real-valued
functions $f$ and $g$ be defined and differentiable on an interval $(a, b)$ (with
$-\infty \le a < b \le \infty$) and $g'(x) \ne 0$ for all $x \in (a, b)$. If $\lim_{x\to a} |g(x)| = \infty$ and
$\lim_{x\to a} \frac{f'(x)}{g'(x)}$ exists in $\overline{\mathbb{R}}$, then $\lim_{x\to a} \frac{f(x)}{g(x)}$ also exists and

\[ \lim_{x\to a} \frac{f(x)}{g(x)} = \lim_{x\to a} \frac{f'(x)}{g'(x)}. \]

A similar statement holds at $b$.
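Proof We will consider first the case where both $a$ and $\ell = \lim_{x\to a} \frac{f'(x)}{g'(x)}$ are real
numbers. Without loss of generality, we may assume that $g' > 0$ (which implies
$\lim_{x\to a} g(x) = -\infty$).
Let $\varepsilon > 0$. By the choice of $\ell$, there is $\delta > 0$ such that

\[ (\ell - \varepsilon)\,g'(x) < f'(x) < (\ell + \varepsilon)\,g'(x), \]

for all $x \in V_\delta = (a, a + \delta]$. Then by Theorem 8.2.4, we get

\[ \ell - \varepsilon \le \frac{f(a + \delta) - f(x)}{g(a + \delta) - g(x)} \le \ell + \varepsilon, \]

that is,

\[ \ell - \varepsilon \le \left[ \frac{f(x)}{g(x)} - \frac{f(a + \delta)}{g(x)} \right] \left[ 1 - \frac{g(a + \delta)}{g(x)} \right]^{-1} \le \ell + \varepsilon \]

for all $x \in (a, a + \delta)$ with $g(x) \ne 0$. Since

\[ \lim_{x\to a} \frac{f(a + \delta)}{g(x)} = \lim_{x\to a} \frac{g(a + \delta)}{g(x)} = 0, \]

this yields

\[ \ell - \varepsilon \le \liminf_{x\to a} \frac{f(x)}{g(x)} \le \limsup_{x\to a} \frac{f(x)}{g(x)} \le \ell + \varepsilon, \]

and thus $\ell \le \liminf_{x\to a} \frac{f(x)}{g(x)} \le \limsup_{x\to a} \frac{f(x)}{g(x)} \le \ell$, as $\varepsilon > 0$ was arbitrarily fixed.

The other cases can be treated by adapting the argument above. For example,
when $a = \infty$ and $\ell = \infty$, we start by noticing that for $\varepsilon > 0$ arbitrarily fixed, there
exists $\delta > 0$ such that $\frac{f'(x)}{g'(x)} \ge \varepsilon$ whenever $x \ge \delta$. By Theorem 8.2.4, we infer that
$f(x) - f(\delta) \ge \varepsilon\,(g(x) - g(\delta))$ for $x \ge \delta$, whence

\[ \liminf_{x\to\infty} \frac{f(x)}{g(x)} \ge \varepsilon. \]

As $\varepsilon > 0$ was arbitrarily fixed, we conclude that $\liminf_{x\to\infty} \frac{f(x)}{g(x)} = \infty$ and thus
$\lim_{x\to\infty} \frac{f(x)}{g(x)} = \infty$.


Under certain circumstances, it is possible to prove converses of the Bernoulli-L'Hôpital
Rules. See Exercise 11 at the end of Sect. 8.9.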
Exercises

1. (a) Solve the equation $2^x + 5^x = 3^x + 4^x$.
(b) Locate the roots of the equation 2* = x. 


2. The Legendre polynomial of degree $n$ is defined by

\[
P_n(x) = \frac{1}{2^n n!}\,\frac{d^n}{dx^n}\left[(x^2-1)^n\right].
\]

Prove, using Rolle's Theorem, that this function has $n$ distinct real roots, all in
the open interval $(-1, 1)$.


3. Infer from the Mean Value Theorem that for all $x, h \in \mathbb{R}$,

\[
|\sin(x + h) - \sin x| \le |h|.
\]

4. What is the origin of the approximation formula

\[
\sqrt{a^2 + x} \approx a + \frac{x}{2a}\,?
\]

Here, $a > 0$, and $x$ is supposed to be close to 0.

5. Find the value of the parameter $a > 0$ such that $6^x + 1 \ge 2^x + a^x$, for all $x \in \mathbb{R}$.
6. Let $f$ be a continuous function on an interval $I$, and let $C$ denote the set of all slopes
of chords joining any two points on the graph of $f$, that is,

\[
C = \left\{ \frac{f(s)-f(r)}{s-r} : s, r \in I \text{ and } s \ne r \right\}.
\]

(a) Prove that $C$ is an interval.

(b) Infer from (a) and the Mean Value Theorem that the derivative of a differentiable
function has the intermediate value property.


7. Consider the function $f_a(x) = \sin\frac{1}{x}$ for $x \ne 0$ and $f_a(0) = a$. Prove that:

(a) The function $f_a(x)$ has the intermediate value property if and only if
$a \in [-1, 1]$.
(b) The function $f_a(x)$ is a derivative if and only if $a = 0$.


8. (V. Jarník) Suppose that $I$ is an interval and $f, g : I \to \mathbb{R}$ are two differentiable
functions such that $g'(x) \ne 0$. Prove that $f'/g'$ has the intermediate value
property.
[Hint: If $f'(a)/g'(a) < \lambda < f'(b)/g'(b)$, apply Theorem 8.3.2 to the function


9. Let $F : [a, \infty) \to \mathbb{R}$ be a $C^1$ function such that $F(a) = 0$ and $\lim_{x\to\infty} F(x) = 0$.
Prove that $F'(c) = 0$ for some $c \in (a, \infty)$.

10. If $f$ is twice differentiable on $(a, \infty)$ and

\[
\lim_{x\to a} f(x) = \lim_{x\to\infty} f(x) = 0,
\]

show that there is a point $c \in (a, \infty)$ such that $f''(c) = 0$.
11. (A proof of Rolle's Theorem due to Pompeiu [1]). Let $f : [a,b] \to \mathbb{R}$ be a
continuous function such that $f(a) = f(b)$.
(a) Prove that there exist points $u_1, v_1$ in $[a, b]$ such that $v_1 - u_1 = (b-a)/2$
and $f(u_1) = f(v_1)$.
(b) Then infer the existence of two sequences of points $(u_n)_n$ and $(v_n)_n$ such that
$a \le u_n < v_n \le b$, $v_n - u_n = (b - a)/2^n$, and $f(u_n) = f(v_n)$, for all $n$.
(c) By the Nested Intervals Lemma, we know that the intervals $[u_n, v_n]$ shrink
to a point $z$ (necessarily in $(a, b)$). Use the formula

\[
0 = \frac{f(v_n) - f(u_n)}{v_n - u_n}
= \frac{v_n - z}{v_n - u_n}\cdot\frac{f(v_n) - f(z)}{v_n - z}
+ \frac{z - u_n}{v_n - u_n}\cdot\frac{f(z) - f(u_n)}{z - u_n}
\]

to infer that 0 lies between $\frac{f(v_n) - f(z)}{v_n - z}$ and $\frac{f(z) - f(u_n)}{z - u_n}$, for all $n$. Assuming that $f$ is
also differentiable at all points of $(a, b)$, conclude that $f'(z) = 0$.

(d) Show by an example that the value of $f$ at $z$ is not necessarily a local maximum
or a local minimum.


8.4 Taylor’s Formula 


In this section, we will present a generalization of the Mean Value Theorem for 
approximating a given function by polynomials. 

Let $I$ be a nondegenerate interval, $a$ a point in $I$ and $f : I \to \mathbb{R}$ a function that is $n$ times
differentiable at $a$. We are interested in the relationship between $f$ and the
Taylor polynomial of order $n$ for $f$ at $a$,

\[
T_n(x) = \sum_{k=0}^{n} \frac{(x-a)^k}{k!}\, f^{(k)}(a).
\]

The Taylor polynomial $T_n(x)$ is the unique polynomial of degree at most $n$ which has the
same derivatives as $f$ at $a$ up to order $n$,

\[
T_n^{(k)}(a) = f^{(k)}(a) \quad \text{for } k = 0, 1, 2, \dots, n.
\]

The main problem under attention is the evaluation of the difference

\[
R_n(x) = f(x) - T_n(x),
\]

usually known as the remainder of Taylor's Formula. This will be done under some
extra hypothesis about the smoothness of $f$. In the case of the Mean Value Theorem,
we required the continuity (equivalently, 0 times differentiability) on $I$ and differentiability
at the interior of $I$ in order to prove that for every $a, x \in I$, there is $c$ between
$a$ and $x$ such that

\[
f(x) = f(a) + (x-a)\,f'(c).
\]

In this case, $n = 0$, $T_n(x) = f(a)$, and $R_n(x) = (x-a)f'(c)$. The case of differentiable
functions of higher order makes the objective of the following result:


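8.4.1 Theorem (Taylor's Formula) Let $f : I \to \mathbb{R}$ be a function which is $n$ times
differentiable on $I$ and $n+1$ times differentiable on the interior of $I$. Then, for every $a, x \in I$,
there is a point $c$ between $a$ and $x$ such that

\[
f(x) = f(a) + \frac{x-a}{1!}\,f'(a) + \cdots + \frac{(x-a)^n}{n!}\,f^{(n)}(a) + \frac{(x-a)^{n+1}}{(n+1)!}\,f^{(n+1)}(c).
\]

Proof We need to show that

\[
R_n(x) = f(x) - T_n(x) = \frac{(x-a)^{n+1}}{(n+1)!}\,f^{(n+1)}(c)
\]

for some $c$ between $a$ and $x$. We start by noticing that

\[
R_n(a) = R_n'(a) = \cdots = R_n^{(n)}(a) = 0 \quad\text{and}\quad R_n^{(n+1)}(x) = f^{(n+1)}(x).
\]

By a repeated application of Cauchy's Mean Value Theorem, we get

\[
\begin{aligned}
\frac{R_n(x)}{(x-a)^{n+1}} &= \frac{R_n(x)-R_n(a)}{(x-a)^{n+1}-(a-a)^{n+1}} = \frac{R_n'(c_1)}{(n+1)(c_1-a)^n} \\
&= \frac{1}{n+1}\cdot\frac{R_n'(c_1)-R_n'(a)}{(c_1-a)^n-(a-a)^n} = \frac{1}{n(n+1)}\cdot\frac{R_n''(c_2)}{(c_2-a)^{n-1}} \\
&= \cdots = \frac{R_n^{(n)}(c_n)}{2\cdots n\,(n+1)(c_n-a)} = \frac{1}{(n+1)!}\cdot\frac{R_n^{(n)}(c_n)-R_n^{(n)}(a)}{(c_n-a)-(a-a)} \\
&= \frac{R_n^{(n+1)}(c)}{(n+1)!} = \frac{f^{(n+1)}(c)}{(n+1)!}.
\end{aligned}
\]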


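8.4.2 Remark For the functions of class $C^n$ on the interval $I$, Taylor's formula has
the form

\[
f(x) = T_n(x) + \frac{(x-a)^n}{n!}\,\omega(x),
\]

where $\omega : I \to \mathbb{R}$ is a function such that $\lim_{x\to a} \omega(x) = \omega(a) = 0$. In other words,

\[
f(x) = T_n(x) + o(|x-a|^n) \tag{8.2}
\]

as $x$ approaches $a$. Indeed, by Theorem 8.4.1 we have

\[
\begin{aligned}
f(x) &= f(a) + \frac{x-a}{1!}\,f'(a) + \cdots + \frac{(x-a)^{n-1}}{(n-1)!}\,f^{(n-1)}(a) + \frac{(x-a)^n}{n!}\,f^{(n)}(c_x) \\
&= f(a) + \frac{x-a}{1!}\,f'(a) + \cdots + \frac{(x-a)^n}{n!}\,f^{(n)}(a) + \frac{(x-a)^n}{n!}\left[f^{(n)}(c_x) - f^{(n)}(a)\right] \\
&= T_n(x) + \frac{(x-a)^n}{n!}\,\omega(x),
\end{aligned}
\]

where

\[
\omega(x) = \begin{cases} f^{(n)}(c_x) - f^{(n)}(a) & \text{if } x \ne a \\ 0 & \text{if } x = a. \end{cases}
\]

Since $f^{(n)}$ is a continuous function, we conclude that

\[
\lim_{x\to a} \omega(x) = \omega(a) = 0.
\]

For an alternative approach to formula (8.2), see Exercise 3, at the end of this
section.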
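The remainder estimate in Theorem 8.4.1 can be checked numerically. The following is a minimal sketch, assuming $f = \exp$ and $a = 0$, so that $|R_n(x)| \le e^{\max(x,0)}\,|x|^{n+1}/(n+1)!$ because $f^{(n+1)}(c) = e^c$ with $c$ between 0 and $x$. The function names `taylor_exp` and `lagrange_bound_exp` are illustrative choices, not notation from the text.

```python
import math


def taylor_exp(x, n):
    """Taylor polynomial T_n(x) of exp at a = 0."""
    return sum(x**k / math.factorial(k) for k in range(n + 1))


def lagrange_bound_exp(x, n):
    """Upper bound for |R_n(x)| given by the Lagrange form of the remainder.

    For exp, f^(n+1)(c) = e^c <= e^max(x, 0) whenever c lies between 0 and x.
    """
    return math.exp(max(x, 0.0)) * abs(x) ** (n + 1) / math.factorial(n + 1)


if __name__ == "__main__":
    for n in range(6):
        x = 1.0
        err = abs(math.exp(x) - taylor_exp(x, n))
        print(n, err, lagrange_bound_exp(x, n))
```

In each printed row the actual error should not exceed the bound, and both decrease rapidly as $n$ grows.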


Taylor's formula has many applications. For example, the well known formula in
kinematics describing the linear motion of an object possessing uniform acceleration,

\[
S(t) = S(0) + v_0 t + \frac{1}{2}\,a t^2,
\]

is an example of Taylor's formula of order 2.

We will show here only two more applications. Several more are included as 
problems or will be presented in the next three sections. 

We start by describing the Newton-Raphson iterative method of finding roots of
a two times differentiable function $f$ defined on an open interval $I$. We assume that
$f$ has a root $\alpha$ in this interval and moreover,

\[
A = \inf_{x\in I} |f'(x)| > 0 \quad\text{and}\quad B = \sup_{x\in I} |f''(x)| < \infty.
\]

Choosing an initial point $x_0 \in I$, we consider the sequence of successive approximations

\[
x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}. \tag{8.3}
\]

Geometrically, $x_{n+1}$ is the abscissa of the point where the line tangent to the graph
of $f$ at $(x_n, f(x_n))$ intersects the $x$ axis. See Fig. 8.5.

Fig. 8.5 The geometry of the Newton-Raphson method


According to Theorem 8.4.1,

\[
f(\alpha) = f(x_n) + f'(x_n)(\alpha - x_n) + \frac{f''(c)}{2}(\alpha - x_n)^2
\]

for some $c$ between $\alpha$ and $x_n$. Since $f(\alpha) = 0$, we get

\[
x_n - \frac{f(x_n)}{f'(x_n)} - \alpha = \frac{f''(c)}{2f'(x_n)}(\alpha - x_n)^2
\]

and thus

\[
|x_{n+1} - \alpha| \le \frac{B}{2A}(x_n - \alpha)^2,
\]

a fact that assures the fast convergence of $(x_n)_n$ to $\alpha$.

It is important to notice that in the case of the function $x^p - a$ (where $p \ge 2$ is
a natural number and $a > 0$ a real number) the iterative formula (8.3) provides the
following generalization of the Babylonian algorithm for computation of the $p$th root
of $a$: whenever $x_0 > 0$, the sequence given recursively by

\[
x_{n+1} = \frac{1}{p}\left((p-1)x_n + \frac{a}{x_n^{p-1}}\right) \quad\text{for } n \ge 0,
\]

is convergent to $\sqrt[p]{a}$.
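The recursion for the $p$th root can be tried out directly. The sketch below is only an illustration under stated assumptions: floating-point arithmetic, a relative stopping tolerance `tol`, and an iteration cap `max_iter`, none of which come from the text; the function name `pth_root` is likewise hypothetical.

```python
def pth_root(a, p, x0=1.0, tol=1e-15, max_iter=200):
    """Approximate a**(1/p) by x_{n+1} = ((p-1)*x_n + a/x_n**(p-1)) / p.

    This is formula (8.3) applied to f(x) = x**p - a, with a > 0, p >= 2, x0 > 0.
    """
    if a <= 0 or p < 2 or x0 <= 0:
        raise ValueError("expected a > 0, p >= 2 and x0 > 0")
    x = float(x0)
    for _ in range(max_iter):
        x_next = ((p - 1) * x + a / x ** (p - 1)) / p
        # Stop once two successive iterates agree to the requested relative tolerance.
        if abs(x_next - x) <= tol * abs(x_next):
            return x_next
        x = x_next
    return x


if __name__ == "__main__":
    print(pth_root(2.0, 2))   # approximates the square root of 2
    print(pth_root(27.0, 3))  # approximates the cube root of 27
```

For $p = 2$ this reduces to the classical Babylonian iteration $x_{n+1} = \frac{1}{2}(x_n + a/x_n)$.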
The next application of Taylor’s formula relates the variation of the first derivative 
to the variations of the function and of its second derivative. 


8.4.3 Theorem (Landau's inequality) Let $f \in C^2(\mathbb{R},\mathbb{R})$ be a function such that
both $f$ and $f''$ are bounded. Then, $f'$ is also bounded and

\[
M_1 \le \sqrt{2M_0M_2},
\]

where $M_k = \sup_{x\in\mathbb{R}} |f^{(k)}(x)|$ for $k \in \{0, 1, 2\}$.




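Proof Let $x \in \mathbb{R}$ and $h > 0$. By Taylor's formula,

\[
f(x+h) = f(x) + f'(x)\,h + f''(c_1)\,\frac{h^2}{2}
\]

and

\[
f(x-h) = f(x) - f'(x)\,h + f''(c_2)\,\frac{h^2}{2}
\]

for suitable points $c_1$ and $c_2$.
Subtracting the two relations, we get

\[
2f'(x)\,h = f(x+h) - f(x-h) - \frac{h^2}{2}\left[f''(c_1) - f''(c_2)\right]
\]

and thus,

\[
|f'(x)| \le \frac{|f(x+h) - f(x-h)|}{2h} + \frac{h}{4}\,|f''(c_1) - f''(c_2)| \le \frac{M_0}{h} + \frac{M_2}{2}\,h.
\]

Consequently, $|f'(x)| \le \inf\left\{\frac{M_0}{h} + \frac{M_2}{2}h : h > 0\right\} = (2M_0M_2)^{1/2}$.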
Exercises

1. Prove that the identity

\[
\frac{1}{1-x} = 1 + x + \cdots + x^n + \frac{x^{n+1}}{1-x}, \quad x \in (-\infty, 1),
\]

is nothing but a special case of Taylor's formula applied to $f(x) = \frac{1}{1-x}$ and
$a = 0$. What about the case $x \in (1, \infty)$?

2. Compute, using Taylor's formula, the following limits:

\[
\lim_{x\to 0} \frac{\cos x - e^{-x^2/2}}{x^4}; \qquad \lim_{x\to 0} \frac{1}{x}\left(\frac{1}{x} - \cot x\right); \qquad \lim_{x\to\infty} \left[x - x^2\log\left(1+\frac{1}{x}\right)\right].
\]

3. Extend inductively the assertion in Corollary 8.2.5 to obtain the following estimate
of the error in Taylor's formula: Let $f : I \to \mathbb{R}$ be a function which is $n$
times differentiable on the interval $I$ and $n+1$ times differentiable in the interior
of $I$. Suppose that $m \le f^{(n+1)}(x) \le M$ on the interior of $I$. Then, for every $x$
and $a$ in $I$, we have

\[
m\cdot\frac{(x-a)^{n+1}}{(n+1)!} \le R_n(x) \le M\cdot\frac{(x-a)^{n+1}}{(n+1)!}.
\]
4. In numerical analysis, the derivatives are approximated with formulas of the
following types:

\[
f'(x) \approx \frac{f(x+h) - f(x)}{h},
\]

\[
f'(x) \approx \frac{f(x+h) - f(x-h)}{2h},
\]

\[
f''(x) \approx \frac{f(x+h) + f(x-h) - 2f(x)}{h^2}.
\]

Use Taylor's formula to evaluate the error in each of these formulas.
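The behaviour of these three difference quotients can be explored experimentally before working out the error terms. The sketch below assumes the test function $f = \sin$ at the point $x = 1$ (an arbitrary choice made only for illustration) and measures how the errors shrink when $h$ is halved; the helper names are hypothetical.

```python
import math


def forward_diff(f, x, h):
    """One-sided quotient (f(x+h) - f(x)) / h."""
    return (f(x + h) - f(x)) / h


def central_diff(f, x, h):
    """Symmetric quotient (f(x+h) - f(x-h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def second_diff(f, x, h):
    """Second-order quotient (f(x+h) + f(x-h) - 2 f(x)) / h^2."""
    return (f(x + h) + f(x - h) - 2 * f(x)) / h**2


def errors(h, x=1.0):
    """Absolute errors of the three quotients for f = sin at the point x."""
    return (
        abs(forward_diff(math.sin, x, h) - math.cos(x)),
        abs(central_diff(math.sin, x, h) - math.cos(x)),
        abs(second_diff(math.sin, x, h) + math.sin(x)),
    )


if __name__ == "__main__":
    coarse, fine = errors(1e-2), errors(5e-3)
    # Ratio of errors when h is halved, for each of the three formulas.
    print([c / f for c, f in zip(coarse, fine)])
```

For moderate step sizes such as these, rounding errors are negligible; for very small $h$ they dominate and the observed ratios are no longer meaningful.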


8.5 Taylor Series Expansion of a Function 


Given an infinitely differentiable function $f$ on an interval $I$ and a point $a \in I$, we
can associate to them a special power series called the Taylor series centered at $a$,

\[
\sum_{n\ge 0} \frac{(x-a)^n}{n!}\,f^{(n)}(a).
\]

The particular case where $a = 0$ was first considered by Maclaurin and so is
called the Maclaurin series.

The following example, due to Cauchy, shows that, in general, a function may not
be equal to its Taylor series. Let

\[
f(x) = \begin{cases} e^{-1/x^2} & \text{if } x \ne 0 \\ 0 & \text{if } x = 0. \end{cases}
\]

Then, $f(0) = f'(0) = f''(0) = \cdots = 0$ and the Maclaurin series of $f$ has all
coefficients equal to 0; see Exercise 1. On the other hand, the function $f$ is equal to
0 only at 0.

The next result indicates a sufficient condition under which a function admits
locally a Taylor series expansion.


8.5.1 Theorem (A.-L. Cauchy) Let $I$ be an open interval and $f \in C^\infty(I, \mathbb{R})$. Suppose
that there are two constants $M > 0$ and $\delta > 0$ such that

\[
\sup_{x\in I} |f^{(n)}(x)| \le M\delta^n n!
\]

for all $n \in \mathbb{N}$. Then

\[
f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!}(x-a)^n
\]

for all $x, a \in I$ such that $|x - a| < 1/\delta$.

Proof Indeed, by Taylor's formula,

\[
\left| f(x) - \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!}(x-a)^k \right| = \left| \frac{f^{(n+1)}(c)}{(n+1)!}(x-a)^{n+1} \right| \le M(\delta|x-a|)^{n+1} \to 0
\]

for all $x \in I$ such that $|x - a| < 1/\delta$.


The status of endpoints of the interval of convergence of a power series is clarified 
by the following theorem. 


8.5.2 Second Abel's Theorem (Abel's Limit Theorem) If the complex power series
$\sum_{n=0}^{\infty} c_n(z - z_0)^n$ converges at $z_1$, then the series converges uniformly for every $z$
in the line segment from $z_0$ to $z_1$, that is, for $z$ in

\[
[z_0, z_1] = \{(1-\lambda)z_0 + \lambda z_1 : \lambda \in [0, 1]\}.
\]

Proof Let $\varepsilon > 0$. Since the series $\sum_{n=0}^{\infty} c_n(z - z_0)^n$ converges at $z_1$, there is an
index $N$ such that

\[
\left| \sum_{n=p+1}^{q} c_n(z_1 - z_0)^n \right| < \varepsilon/2 \tag{8.4}
\]

for all indices $q > p \ge N$. Let $z \in [z_0, z_1]$. Then $z = (1-\lambda)z_0 + \lambda z_1$ for some
$\lambda \in [0, 1]$, so that by Abel's Partial Summation Formula, for all $q > p \ge N$, we
have

\[
\begin{aligned}
\sum_{n=p+1}^{q} c_n(z - z_0)^n &= \sum_{n=p+1}^{q} c_n(z_1 - z_0)^n\lambda^n \\
&= \sum_{n=p+1}^{q-1} (\lambda^n - \lambda^{n+1}) \sum_{k=p+1}^{n} c_k(z_1 - z_0)^k + \lambda^q \sum_{k=p+1}^{q} c_k(z_1 - z_0)^k.
\end{aligned}
\]

Taking into account (8.4), we get

\[
\left| \sum_{n=p+1}^{q} c_n(z - z_0)^n \right| \le \frac{\varepsilon}{2} \sum_{n=p+1}^{q-1} (\lambda^n - \lambda^{n+1}) + \frac{\varepsilon}{2}\lambda^q < \varepsilon
\]

for all $q > p \ge N$ and all $z \in [z_0, z_1]$. The conclusion of Theorem 8.5.2 follows
now from Cauchy's Criterion on uniform convergence.


8.5.3 Corollary Let $f$ be a real-valued continuous function, defined on the interval
$[a, b]$. If

\[
f(x) = \sum_{n=0}^{\infty} c_n(x - a)^n
\]

for all $x \in [a, b)$ and the series is convergent at $x = b$, then

\[
f(b) = \sum_{n=0}^{\infty} c_n(b - a)^n.
\]

Proof Indeed, the function $f$ and the sum $S$ of the power series are continuous on
$[a, b]$ and $f(x) = S(x)$ on $[a, b)$. The equality at $b$ is obtained by taking the limit as
$x \to b$.


As an application of the above results, we will prove the Maclaurin expansion of the
function $f(x) = \log(1 + x)$:

\[
\log(1+x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \cdots \quad\text{for all } x \in (-1, 1]. \tag{8.5}
\]

Indeed, $f^{(n)}(x) = \frac{(-1)^{n-1}(n-1)!}{(1+x)^n}$ for all $n \ge 1$, so that the series in the right-hand
side is the Maclaurin series of $f$. Since for every $\delta \in (0, 1)$,

\[
\sup_{x\in(-\delta,\delta)} |f^{(n)}(x)| \le \left(\frac{1}{1-\delta}\right)^n n!,
\]

we infer from Theorem 8.5.1 that the formula (8.5) holds for all $x$ in $(-1, 1)$. The
Leibniz Test implies that the Maclaurin series is also convergent at $x = 1$. Thus, by
Corollary 8.5.3, we conclude that (8.5) holds also at $x = 1$.

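A particular case of (8.5) is Euler's formula for $\log 2$,

\[
\log 2 = \sum_{k=1}^{\infty} \frac{1}{k2^k}, \tag{8.6}
\]

whose nice feature is that it yields the $n$th binary digit of $\log 2$ without knowing
its previous binary digits. Let

\[
\log 2 = 0.d_1d_2d_3\ldots = \sum_{k=1}^{\infty} 2^{-k}d_k.
\]

If we want to compute the $(n+1)$th digit $d_{n+1}$, we multiply by $2^n$:

\[
2^n\log 2 = 2^{n-1}d_1 + 2^{n-2}d_2 + \cdots + d_n + \frac{d_{n+1}}{2} + \frac{d_{n+2}}{4} + \cdots = d_1d_2\ldots d_n.d_{n+1}d_{n+2}\ldots
\]

Thus, $d_{n+1} = 0$ if the fractional part of $2^n\log 2$ is smaller than $2^{-1}$; otherwise,
$d_{n+1} = 1$. The computation of the fractional part of $2^n\log 2$ is based on the formula

\[
\{2^n\log 2\} = \left\{ \sum_{k=1}^{\infty} \frac{2^{n-k}}{k} \right\}
= \left\{ \sum_{k=1}^{n} \frac{2^{n-k}}{k} + \sum_{k=n+1}^{\infty} \frac{1}{k2^{k-n}} \right\}
= \left\{ \sum_{k=1}^{n} \frac{2^{n-k} \bmod k}{k} + \sum_{k=n+1}^{\infty} \frac{1}{k2^{k-n}} \right\}.
\]

Since $S = \sum_{k=n+1}^{\infty} \frac{1}{k2^{k-n}} < \frac{1}{n}$, if $n$ is large this sum makes only a small
contribution. Remembering that we only need to determine whether $\{2^n\log 2\}$ is
smaller or larger than $2^{-1}$, we simply compute enough terms in $S$ to settle the question.

For example, if we want to determine the 11th binary digit of $\log 2$, we have to
consider the following computation:

\[
\begin{aligned}
\{2^{10}\log 2\} &= \left\{ \sum_{k=1}^{10} \frac{2^{10-k} \bmod k}{k} + \sum_{k=11}^{\infty} \frac{1}{k2^{k-10}} \right\} \\
&= \frac{44}{63} + \frac{521}{6864} + \text{a positive number less than } \frac{1}{112} \\
&= 0.774315961815\ldots_{10} + \text{a positive number less than } 0.008928571428\ldots_{10} \\
&= 0.1100011\ldots_2 + \text{a positive number less than } 0.0000001001\ldots_2.
\end{aligned}
\]

Hence the fractional part lies between $0.1100011\ldots_2$ and $0.1100100\ldots_2$, so it begins with $0.1100\ldots_2$.
Thus, the 11th to 14th binary digits of $\log 2$ are 1, 1, 0, 0; settling the later digits requires more terms of $S$.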
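The digit-extraction idea based on (8.6) can be sketched in a few lines. The code below is an illustration only: it uses exact rational arithmetic, truncates the tail sum $S$ after `extra_terms` terms (an assumed parameter, so the answer could in principle be wrong when the fractional part lies extremely close to $1/2$), and the function name `log2_binary_digit` is hypothetical.

```python
from fractions import Fraction


def log2_binary_digit(position, extra_terms=60):
    """Return the binary digit d_position of log 2, for position >= 1.

    With n = position - 1, the fractional part of 2^n * log 2 is computed as
    sum_{k<=n} (2^(n-k) mod k)/k  +  sum_{k>n} 1/(k * 2^(k-n)),  reduced mod 1.
    """
    n = position - 1
    # Head: only the residues 2^(n-k) mod k matter, obtained by modular exponentiation.
    head = sum(Fraction(pow(2, n - k, k), k) for k in range(1, n + 1)) % 1
    # Tail: truncated after extra_terms terms; the neglected part is below 2**(-extra_terms).
    tail = sum(Fraction(1, k * 2 ** (k - n)) for k in range(n + 1, n + 1 + extra_terms))
    frac = (head + tail) % 1
    return 1 if frac >= Fraction(1, 2) else 0


if __name__ == "__main__":
    print([log2_binary_digit(i) for i in range(1, 17)])
```

The modular exponentiation `pow(2, n - k, k)` is what makes the method practical for large positions, since the huge integers $2^{n-k}$ are never formed.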


8.5.4 Remark The derivative of an even function is an odd function and vice versa.
Therefore, the Maclaurin expansion $f(x) = \sum_{n\ge 0} c_nx^n$ of an even function is a series
where all coefficients with odd indices are zero (that is, $c_1 = c_3 = c_5 = \cdots = 0$),
while in the case of an odd function all coefficients with even indices are zero (that
is, $c_0 = c_2 = c_4 = \cdots = 0$).


Exercises 


1. Consider the function

\[
f(x) = \begin{cases} e^{-1/x^2} & \text{if } x \ne 0 \\ 0 & \text{if } x = 0. \end{cases}
\]

(a) Prove that $f$ is differentiable and

\[
f'(x) = \begin{cases} \frac{2}{x^3}\,e^{-1/x^2} & \text{if } x \ne 0 \\ 0 & \text{if } x = 0. \end{cases}
\]

For the computation of $f'(0)$ one can use Corollary 8.3.6.

(b) Prove by induction that all derivatives of $f$ exist and the $n$th derivative has
the form

\[
f^{(n)}(x) = \begin{cases} \frac{P_n(x)}{x^{3n}}\,e^{-1/x^2} & \text{if } x \ne 0 \\ 0 & \text{if } x = 0, \end{cases}
\]

where $P_n(x)$ is a suitable polynomial of degree at most $2n$.

(c) Conclude that $f(0) = f'(0) = f''(0) = \cdots = 0$, so that the Maclaurin
series of $f$ has all coefficients equal to 0.
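2. Prove that the function

\[
f(x) = \begin{cases} e^{1/(x^2-1)} & \text{if } |x| < 1 \\ 0 & \text{if } |x| \ge 1 \end{cases}
\]

is of class $C^\infty$ and infer from this fact that for every pair of real numbers $a < b$,
the function

\[
f_{a,b}(x) = \begin{cases} e^{1/((x-a)(x-b))} & \text{if } x \in (a, b) \\ 0 & \text{if } x \in \mathbb{R}\setminus(a, b) \end{cases}
\]

is also of class $C^\infty$.
[Hint: Use a bijection between $[-1, 1]$ and $[a, b]$.]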
3. Justify the Maclaurin series expansion 


for x € (-1, 1). 


Note Since the mapping x > (1 + x) /(1 — x) is a bijection from (—1, 1) onto 

(0, 00), the above formula provides a convenient way of computing the natural 

logarithm of any strictly positive number. 
4. Find the Maclaurin series expansion for the functions $\sin^3 x$ and $\cos^3 x$.

[Hint: See the formula $4\cos^3 x = 3\cos x + \cos 3x$ and the similar one for $\sin^3 x$.]
5. Justify the Maclaurin series expansion

\[
\arctan x = x - \frac{x^3}{3} + \frac{x^5}{5} - \cdots \quad\text{for all } x \in [-1, 1],
\]

and show that 
6. Prove that

\[
\arcsin x = x + \sum_{n=1}^{\infty} \frac{1\cdot 3\cdots(2n-1)}{2\cdot 4\cdots(2n)}\cdot\frac{x^{2n+1}}{2n+1} \quad\text{for all } x \in [-1, 1].
\]

[Hint: At $x = 1$, use Exercise 5, Sect. 2.1, and Corollary 8.5.3.]
7. Prove that the Maclaurin series expansion,

\[
-\ln(1-x)\ln(1+x) = \sum_{n=1}^{\infty} \left(1 - \frac{1}{2} + \frac{1}{3} - \cdots + \frac{1}{2n-1}\right)\frac{x^{2n}}{n},
\]

is valid for all $x$ with $|x| < 1$.
[Hint: Use the result of Exercise 2 at the end of Sect. 2.3.]

8. (L. Euler). Let $p(n)$ be the number of all triplets $r, s, t$ of nonnegative integer
numbers such that $r + 2s + 3t = n$. Prove that

\[
\frac{1}{(1-x)(1-x^2)(1-x^3)} = \sum_{n=0}^{\infty} p(n)x^n, \quad\text{for } x \in (-1, 1).
\]


8.6 The Differentiation of Power Series 


The next theorem gives a sufficient condition under which the derivative of the limit 
equals the limit of the derivatives. 


8.6.1 Theorem (K. Weierstrass) Let $(f_n)_n$ be a sequence of real-valued differentiable
functions defined on an interval $I$, that is pointwise convergent to a function $f$ and
for every point $a$ in $I$, there is a number $r > 0$ such that $(f_n')_n$ is uniformly convergent
on $I_r(a) = I \cap (a-r, a+r)$. Then, $f_n \to f$ uniformly on each of the sets $I_r(a)$,
the function $f$ is differentiable and its derivative equals the pointwise limit of the
sequence $(f_n')_n$. In other words,

\[
\left(\lim_{n\to\infty} f_n\right)' = \lim_{n\to\infty} f_n'.
\]


Proof Let $a \in I$ be arbitrarily fixed and let $r > 0$ be as in the statement. By the Mean
Value Theorem,

\[
|(f_m(x) - f_m(a)) - (f_n(x) - f_n(a))| \le |x - a|\cdot\sup_{z\in[a,x]} |f_m'(z) - f_n'(z)|, \tag{8.7}
\]

and thus

\[
\begin{aligned}
|f_m(x) - f_n(x)| &\le |f_m(a) - f_n(a)| + |x - a|\cdot\sup_{z\in[a,x]} |f_m'(z) - f_n'(z)| \\
&\le |f_m(a) - f_n(a)| + r\cdot\sup_{z\in I_r(a)} |f_m'(z) - f_n'(z)|
\end{aligned}
\]

for all $x \in I_r(a)$ and all $m, n \in \mathbb{N}$. Therefore, $f_n \to f$ uniformly on $I_r(a)$.

Since uniform convergence implies pointwise convergence, the hypotheses assure
the existence of a function $g : I \to \mathbb{R}$ such that $f_n' \to g$ pointwise.

To finish the proof, we need to show that $f$ is differentiable and $f' = g$. To prove
this, let $a \in I$ and $\varepsilon > 0$. Since $f_n' \to g$ uniformly on each of the sets $I_r(a)$, there is
an index $N$ such that

\[
|f_m'(x) - f_n'(x)| \le \varepsilon/3
\]

for all $x \in I_r(a)$ and all $m, n \ge N$. Taking the limit as $m \to \infty$ and making $x = a$, we
get

\[
|g(a) - f_n'(a)| \le \varepsilon/3 \quad\text{for every } n \ge N.
\]

On the other hand, from (8.7) we infer that

\[
|f(x) - f(a) - f_N(x) + f_N(a)| \le \frac{\varepsilon}{3}\,|x - a|
\]

for all $x \in I_r(a)$. Since $f_N$ is differentiable at $a$, there is $r' \in (0, r)$ such that

\[
|f_N(x) - f_N(a) - f_N'(a)(x - a)| \le \frac{\varepsilon}{3}\,|x - a|
\]

for all $x \in I_{r'}(a)$. Consequently,

\[
\begin{aligned}
|f(x) - f(a) - g(a)(x - a)| &\le |f(x) - f(a) - f_N(x) + f_N(a)| + |f_N(x) - f_N(a) - f_N'(a)(x - a)| \\
&\quad + |(f_N'(a) - g(a))(x - a)| \\
&\le \varepsilon\,|x - a|
\end{aligned}
\]

for all $x \in I_{r'}(a)$, hence $f$ is differentiable at $a$ and $f'(a) = g(a)$.


The previous theorem can be reformulated in terms of series: 


8.6.2 Theorem (The term-by-term differentiation of series of functions) Let $\sum_{n\ge 0} f_n$
be a series of differentiable functions defined on an interval $I$. If $\sum_{n\ge 0} f_n$ is pointwise
convergent to a function $f$ and for every $a \in I$, there is $r > 0$ such that $\sum_{n\ge 0} f_n'$ is
uniformly convergent on $I_r(a) = I \cap (a-r, a+r)$, then the series $\sum_{n\ge 0} f_n$ converges
uniformly to $f$ on each of the sets $I_r(a)$, the function $f$ is differentiable and

\[
\left(\sum_{n=0}^{\infty} f_n\right)' = \sum_{n=0}^{\infty} f_n'.
\]
8.6.3 Corollary Let $\sum_{n\ge 0} c_n(x-a)^n$ be a power series whose radius of convergence
is $R > 0$. Then, its sum $S(x)$ is infinitely differentiable on the interval $(a - R, a + R)$
and

\[
\frac{d^k}{dx^k} S(x) = \sum_{n=0}^{\infty} \frac{d^k}{dx^k}\left[c_n(x - a)^n\right]
\]

for all $k \in \mathbb{N}$ and all $x \in (a-R, a+R)$. Moreover, the coefficients $c_n$ are given by
the formulas

\[
c_n = \frac{S^{(n)}(a)}{n!}, \quad n \in \mathbb{N}.
\]

These formulas show that the power series expansion of a function around a point
is unique.


Proof This follows from Theorem 8.6.2 by induction, noticing that at each step, the 
radius of convergence of the new series is still R. 


8.6.4 Corollary Every real analytic function $f : U \to \mathbb{R}$ is infinitely differentiable.
Moreover, for every point $a \in U$ and every positive number $R$ with the property that
$(a-R, a+R) \subset U$, we have

\[
f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!}(x - a)^n \quad\text{for } x \in (a-R, a+R).
\]


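Applying the result of Corollary 8.6.3 to the power series expansion

\[
\frac{1}{1-x} = 1 + x + x^2 + \cdots, \quad x \in (-1, 1),
\]

we infer that

\[
\frac{1}{(1-x)^2} = 1 + 2x + 3x^2 + \cdots, \quad x \in (-1, 1),
\]

and this procedure can be continued indefinitely.
Based on Corollary 8.6.3, we will prove the generalized binomial formula,

\[
(1+x)^\alpha = 1 + \binom{\alpha}{1}x + \binom{\alpha}{2}x^2 + \cdots \quad\text{for } x \in (-1, 1) \text{ and } \alpha \in \mathbb{R}. \tag{8.8}
\]

The combinations $\binom{\alpha}{n}$ are given by

\[
\binom{\alpha}{0} = 1 \quad\text{and}\quad \binom{\alpha}{n} = \frac{\alpha(\alpha-1)\cdots(\alpha-n+1)}{n!} \quad\text{for } \alpha \in \mathbb{R},\ n \ge 1.
\]

It is worth mentioning that for $\alpha > 0$, the binomial series is actually absolutely
and uniformly convergent on the interval $[-1, 1]$.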




Clearly, the radius of convergence of the binomial series is 1. According to
Corollary 8.6.3, its sum $S_\alpha(x)$ is differentiable on $(-1, 1)$ and an easy computation
shows that

\[
(1 + x)S_\alpha'(x) - \alpha S_\alpha(x) = 0,
\]

that is,

\[
\left[(1 + x)^{-\alpha}S_\alpha(x)\right]' = 0
\]

on this interval. Therefore,

\[
(1 + x)^{-\alpha}S_\alpha(x) = (1 + 0)^{-\alpha}S_\alpha(0) = 1,
\]

whence $S_\alpha(x) = (1 + x)^\alpha$ for $x \in (-1, 1)$.

When $\alpha > 0$, we have to consider the sequence $a_n = \left|\binom{\alpha}{n}\right|$. Since $\frac{a_{n+1}}{a_n} = \frac{n-\alpha}{n+1}$
for $n > \alpha$, we have

\[
na_n - (n+1)a_{n+1} = \alpha a_n \ge 0.
\]

Thus, the sequence $(na_n)_n$ is positive and eventually decreasing, hence convergent.
This shows that the series $\sum_n a_n$ is convergent, which by the Comparison Test
implies the absolute and uniform convergence of the binomial series on $[-1, 1]$.
Both sides of the formula (8.8) represent continuous functions on $[-1, 1]$, that are
equal on $(-1, 1)$. Hence, the equality holds everywhere on $[-1, 1]$.


Exercises 


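1. Find the Maclaurin series expansion of the function

\[
f(x) = \left(\frac{1+x}{1-x}\right)^3
\]

on the interval $(-1, 1)$.
[Hint: Notice that

\[
\left(\frac{1+x}{1-x}\right)^3 = \left(-1 + \frac{2}{1-x}\right)^3 = -1 + \frac{6}{1-x} - \frac{12}{(1-x)^2} + \frac{8}{(1-x)^3}.]
\]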


2. Infer from the generalized binomial formula that

\[
\arcsin x = \sum_{n\ge 0} \frac{1}{4^n}\binom{2n}{n}\frac{x^{2n+1}}{2n+1} \quad\text{for all } x \in [-1, 1].
\]


3. (The method of undetermined coefficients). We want to find the coefficients of
the Maclaurin series expansion

\[
\frac{\log(1 + x)}{1+x} = c_0 + c_1x + c_2x^2 + \cdots.
\]

Write it as $\log(1+x) = (1 + x)(c_0 + c_1x + c_2x^2 + \cdots)$ and identify the coefficients
based on the uniqueness of power series expansions (see Corollary 8.6.3)
to conclude that

\[
c_n = (-1)^{n-1}\left(1 + \frac{1}{2} + \cdots + \frac{1}{n}\right).
\]


. (The power series method). An important method to find the solution of a differential
equation like
ul’ + dru =f (x) 


is to expand the right-hand side in Maclaurin series and to seek solutions of
the type $u = \sum_{n=0}^{\infty} c_nx^n$. The coefficients $c_n$ are determined by identifying the
corresponding coefficients in the power series of the two sides. In the end, we
analyze the convergence of the resulting series for $u$.

Find, using this method, a solution of the equation 


u’ +u=x—sinx. 


. Prove that the Maclaurin series expansion of $(\arccos x)^2$ for $x \in (-1, 1)$ is

\[
(\arccos x)^2 = \frac{\pi^2}{4} - \pi x + x^2 + 2\sum_{n=2}^{\infty} \frac{[2\cdot 4\cdots(2n-2)]^2}{(2n)!}\,x^{2n} - \pi\sum_{n=1}^{\infty} \frac{[1\cdot 3\cdots(2n-1)]^2}{(2n+1)!}\,x^{2n+1}.
\]

[Hint: It is far too complicated to determine and use the higher order derivatives
of this function. It is better to use a method derived from the one in the previous
problem. First, we need a differential equation. Let $y = (\arccos x)^2$. Then

\[
y' = \frac{-2\arccos x}{\sqrt{1-x^2}},
\]

whence $(1 - x^2)(y')^2 - 4y = 0$. By differentiating this last formula once more,
we get

\[
-2x(y')^2 + 2(1 - x^2)y'y'' - 4y' = 0.
\]

Since the derivative of $y$ is never equal to 0 on $(-1, 1)$, we infer that

\[
(1 - x^2)y'' - xy' - 2 = 0.
\]

To this differential equation, we add initial conditions: $y(0) = \frac{\pi^2}{4}$ and $y'(0) = -\pi$.
We seek solutions of the type $y = \sum_{n=0}^{\infty} c_nx^n$. Substituting this $y$ in the
equation, we get

\[
(1 - x^2)\sum_{n=0}^{\infty} n(n-1)c_nx^{n-2} - \sum_{n=0}^{\infty} nc_nx^n - 2 = 0,
\]

hence

\[
2c_2 = 2, \qquad (n+2)(n+1)c_{n+2} = n^2c_n, \quad\text{for } n \ge 1.
\]

From the initial values, we find that $c_0 = \pi^2/4$ and $c_1 = -\pi$.]


5. Find the Maclaurin series expansion of the function 


fw 2x arccos x (1,0 
5) ee —4 > x a . 
(1 — x?) 1/2 


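6. (L. Euler). The dilogarithm function is the function defined by the power series
$\mathrm{Li}_2(x) = \sum_{n=1}^{\infty} \frac{x^n}{n^2}$ for $|x| \le 1$. Prove that $(\mathrm{Li}_2(x))' = -\frac{\log(1-x)}{x}$ and

\[
\mathrm{Li}_2(x) + \mathrm{Li}_2(1 - x) = \frac{\pi^2}{6} - \log x\,\log(1 - x).
\]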


8.7 Bernoulli Numbers and Polynomials 


The following three familiar formulas concerning the powers of consecutive integers,

\[
\sum_{k=1}^{n} k = \frac{n(n+1)}{2}, \qquad
\sum_{k=1}^{n} k^2 = \frac{n(n+1)(2n+1)}{6}, \qquad
\sum_{k=1}^{n} k^3 = \left[\frac{n(n+1)}{2}\right]^2,
\]

lead naturally to the problem of finding the general formula for the $p$th power. This
was solved by Jacob Bernoulli using a special class of numbers and a special class
of polynomials that now bear his name.
The Bernoulli numbers $B_n$ are the coefficients of the Maclaurin series expansion

\[
\frac{x}{e^x - 1} = B_0 + \frac{B_1}{1!}x + \frac{B_2}{2!}x^2 + \cdots \quad\text{for } |x| < 2\pi. \tag{8.9}
\]

The fact that this power series expansion is indeed valid in the interval $(-2\pi, 2\pi)$
makes the objective of Exercise 1, at the end of this section.
Taking into account the series expansion of $e^x$ and the product of series, we infer
that

\[
B_0 = 1, \quad \frac{B_0}{0!\,2!} + \frac{B_1}{1!\,1!} = 0, \quad \ldots, \quad \frac{B_n}{n!\,1!} + \frac{B_{n-1}}{(n-1)!\,2!} + \cdots + \frac{B_0}{0!\,(n+1)!} = 0,
\]

so that the Bernoulli numbers $B_n$ verify the recurrence formula

\[
B_0 = 1 \quad\text{and}\quad \binom{n}{0}B_0 + \binom{n}{1}B_1 + \cdots + \binom{n}{n-1}B_{n-1} = 0 \quad\text{for } n \ge 2.
\]

Therefore,

\[
B_0 = 1, \quad B_1 = -1/2, \quad B_2 = 1/6, \quad B_4 = -1/30, \quad B_6 = 1/42, \ldots
\]

and

\[
B_3 = B_5 = B_7 = \cdots = 0.
\]


The last formula is motivated by the remark that

\[
\frac{x}{e^x - 1} + \frac{x}{2} = 1 + \sum_{n=2}^{\infty} \frac{B_n}{n!}x^n
\]

represents the power series expansion of an even function (which forces the
right-hand side to contain no odd powers of $x$).
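By taking into account that

\[
\frac{z}{e^z - 1} + \frac{z}{2} = \frac{z}{2}\cdot\frac{e^{z/2} + e^{-z/2}}{e^{z/2} - e^{-z/2}} = \frac{z}{2}\cdot\frac{\cosh\frac{z}{2}}{\sinh\frac{z}{2}} \quad\text{for all } z \in \mathbb{C}\setminus\{0\}
\]

and

\[
\cos x = \cosh ix, \qquad \sin x = -i\sinh ix,
\]

we obtain the following power series expansion of $x\cot x$:

\[
x\cot x = ix\,\frac{\cosh ix}{\sinh ix} = \frac{2ix}{e^{2ix} - 1} + \frac{2ix}{2} = \sum_{n=0}^{\infty} (-4)^n\frac{B_{2n}}{(2n)!}\,x^{2n}.
\]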




Next, we will present an alternative approach built on the formula $x\cot x = x\frac{d}{dx}(\log\sin x)$
and the infinite product expansion of the sine function.


8.7.1 Theorem (Euler's infinite product for the sine function) For every real
number $x$,

\[
\sin x = x\prod_{n=1}^{\infty}\left(1 - \frac{x^2}{n^2\pi^2}\right).
\]

This formula extends Wallis' product for $\pi$, which represents the particular case
where $x = \frac{\pi}{2}$,

\[
\frac{\pi}{2} = \prod_{n=1}^{\infty} \frac{(2n)^2}{(2n-1)(2n+1)}.
\]

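Proof Notice first that $\sin(2n+1)\theta$ is a polynomial of degree $2n + 1$ in $\sin\theta$. Indeed,
by combining de Moivre's Formula and the Binomial Formula, we get

\[
\cos(2n+1)\theta + i\sin(2n+1)\theta = (\cos\theta + i\sin\theta)^{2n+1} = \sum_{k=0}^{2n+1}\binom{2n+1}{k}\,i^k\cos^{2n+1-k}\theta\,\sin^k\theta,
\]

whence, by equating the imaginary parts, we obtain that $\sin(2n+1)\theta$ equals

\[
\binom{2n+1}{1}(1-\sin^2\theta)^n\sin\theta - \binom{2n+1}{3}(1-\sin^2\theta)^{n-1}\sin^3\theta + \cdots + (-1)^n\binom{2n+1}{2n+1}\sin^{2n+1}\theta = \sin\theta\,P_n(\sin^2\theta),
\]

where $P_n(x)$ is a polynomial of degree $n$ with integer coefficients. If $\theta$ is a root of
$\sin(2n+1)\theta = 0$ such that $\sin\theta \ne 0$, then $\sin^2\theta$ is a root of $P_n(x)$.
Since $2n + 1$ distinct roots of $\sin(2n+1)\theta$, viewed as a polynomial in $\sin\theta$, are $\pm\sin\frac{k\pi}{2n+1}$, for $k = 0, \ldots, n$, we infer
that the roots of $P_n(x)$ are $\sin^2\frac{\pi}{2n+1}, \sin^2\frac{2\pi}{2n+1}, \ldots, \sin^2\frac{n\pi}{2n+1}$. Therefore,

\[
\sin(2n+1)\theta = C\sin\theta\prod_{k=1}^{n}\left(1 - \frac{\sin^2\theta}{\sin^2\frac{k\pi}{2n+1}}\right),
\]

where $C$ is a constant that can be computed by dividing both sides by $\sin\theta$ and then
passing to the limit as $\theta \to 0$. We obtain $C = 2n + 1$. By performing the change of
variable $\theta = \frac{x}{2n+1}$, we arrive at the identity

\[
\sin x = (2n+1)\sin\frac{x}{2n+1}\prod_{k=1}^{n}\left(1 - \frac{\sin^2\frac{x}{2n+1}}{\sin^2\frac{k\pi}{2n+1}}\right). \tag{8.10}
\]

Suppose that $x > 0$ and fix arbitrarily two integers $m$ and $n$ such that $x < m < n$.

We will estimate the right-hand side product in (8.10) by using Jordan's inequality.
This inequality asserts that $2a/\pi < \sin a < a$ for $0 < a < \pi/2$. See Exercise 4,
Sect. 8.2. Denoting by $a_k$ the $k$th factor in this product, we infer that

\[
0 < 1 - \frac{x^2}{4k^2} < a_k < 1 \quad\text{for } m < k \le n,
\]

which yields

\[
1 > \prod_{k=m+1}^{n} a_k > \prod_{k=m+1}^{n}\left(1 - \frac{x^2}{4k^2}\right) > 1 - \frac{x^2}{4}\sum_{k=m+1}^{n}\frac{1}{k^2} > 1 - \frac{x^2}{4m}.
\]

Hence

\[
\prod_{k=1}^{m}\left(1 - \frac{\sin^2\frac{x}{2n+1}}{\sin^2\frac{k\pi}{2n+1}}\right) > \frac{\sin x}{(2n+1)\sin\frac{x}{2n+1}} > \left(1 - \frac{x^2}{4m}\right)\prod_{k=1}^{m}\left(1 - \frac{\sin^2\frac{x}{2n+1}}{\sin^2\frac{k\pi}{2n+1}}\right),
\]

and passing to the limit as $n \to \infty$, we deduce that

\[
\prod_{k=1}^{m}\left(1 - \frac{x^2}{k^2\pi^2}\right) \ge \frac{\sin x}{x} \ge \left(1 - \frac{x^2}{4m}\right)\prod_{k=1}^{m}\left(1 - \frac{x^2}{k^2\pi^2}\right).
\]

The proof ends by letting $m \to \infty$.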


See Exercise 10, Sect. 12.2, for an alternative proof. 
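The convergence of Euler's product can be observed numerically. The sketch below truncates the product after $N$ factors (the name `sine_partial_product` and the chosen values of $N$ are assumptions made for illustration) and compares the result with the library value of $\sin x$.

```python
import math


def sine_partial_product(x, N):
    """Partial product x * prod_{n=1}^{N} (1 - x^2 / (n^2 * pi^2))."""
    prod = x
    for n in range(1, N + 1):
        prod *= 1.0 - x * x / (n * n * math.pi**2)
    return prod


if __name__ == "__main__":
    x = 1.0
    for N in (10, 100, 1000, 10000):
        print(N, abs(sine_partial_product(x, N) - math.sin(x)))
```

The printed errors shrink roughly like $1/N$, which reflects the slow convergence of the infinite product.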
Euler's Product Formula allows us to relate the power series expansion of $x\cot x$
to Riemann's zeta function

\[
\zeta(x) = \sum_{n=1}^{\infty}\frac{1}{n^x} \quad\text{for } x > 1.
\]

See Exercise 7, Sect. 6.8, for the well definedness of this function. We have

\[
\begin{aligned}
x\cot x = x\,\frac{d}{dx}(\log\sin x) &= 1 - 2\sum_{n=1}^{\infty}\frac{x^2}{n^2\pi^2 - x^2} \\
&= 1 - 2\sum_{n=1}^{\infty}\left(\frac{x^2}{n^2\pi^2} + \frac{x^4}{n^4\pi^4} + \frac{x^6}{n^6\pi^6} + \cdots\right) \\
&= 1 - 2\sum_{n=1}^{\infty}\frac{x^{2n}}{\pi^{2n}}\,\zeta(2n),
\end{aligned}
\]

via Dirichlet's Summability Theorem and Theorem 8.6.2 (on the term-by-term
differentiability of series of functions). By taking into account the uniqueness of series
expansion (Corollary 8.6.3 above), we recover Euler's miraculous computation of
the values of the zeta function.

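8.7.2 Theorem (L. Euler) For all positive integers $n$,

\[
\zeta(2n) = 1 + \frac{1}{2^{2n}} + \frac{1}{3^{2n}} + \cdots = \frac{2^{2n-1}\pi^{2n}}{(2n)!}\,|B_{2n}|. \tag{8.11}
\]

In particular,

\[
\zeta(2) = 1 + \frac{1}{2^2} + \frac{1}{3^2} + \cdots = \frac{\pi^2}{6},
\]

\[
\zeta(4) = 1 + \frac{1}{2^4} + \frac{1}{3^4} + \cdots = \frac{\pi^4}{90}.
\]

Another proof in the case of $\zeta(2)$ is given at the end of Sect. 9.3.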
The Bernoulli polynomials can be introduced in the same manner as coefficients
in the following power series expansion,

\[
\frac{ze^{xz}}{e^z - 1} = \sum_{n=0}^{\infty}\frac{B_n(x)}{n!}\,z^n,
\]

where $x$ is a real parameter.
The Bernoulli numbers are related to the Bernoulli polynomials by the formula

\[
B_n = B_n(0).
\]

Using the decomposition

\[
\frac{ze^{xz}}{e^z - 1} = \frac{z}{e^z - 1}\cdot e^{xz} = \left(\sum_{n=0}^{\infty}\frac{B_n}{n!}\,z^n\right)\left(\sum_{n=0}^{\infty}\frac{x^n}{n!}\,z^n\right),
\]

we infer the formula

\[
B_n(x) = \sum_{k=0}^{n}\binom{n}{k}B_kx^{n-k}. \tag{8.12}
\]

In particular,
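\[
\begin{aligned}
B_0(x) &= 1, \\
B_1(x) &= x - \frac{1}{2}, \\
B_2(x) &= x^2 - x + \frac{1}{6}, \\
B_3(x) &= x^3 - \frac{3}{2}x^2 + \frac{1}{2}x, \\
B_4(x) &= x^2(1-x)^2 - \frac{1}{30}, \\
B_5(x) &= x^5 - \frac{5}{2}x^4 + \frac{5}{3}x^3 - \frac{1}{6}x, \\
B_6(x) &= x^6 - 3x^5 + \frac{5}{2}x^4 - \frac{1}{2}x^2 + \frac{1}{42}, \\
B_7(x) &= x^7 - \frac{7}{2}x^6 + \frac{7}{2}x^5 - \frac{7}{6}x^3 + \frac{1}{6}x.
\end{aligned}
\]

From (8.12), we immediately infer the following derivative formulas for $n \ge 2$:

\[
B_n'(x) = nB_{n-1}(x), \qquad B_n^{(n)}(x) = n!.
\]

They are instrumental in establishing the Euler-Maclaurin Formula. See
Theorem 10.2.3 below.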
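Using the Maclaurin expansion (8.9) and the identity

\[
\frac{ze^{(x+1)z}}{e^z - 1} - \frac{ze^{xz}}{e^z - 1} = ze^{xz},
\]

we obtain

\[
\sum_{n=0}^{\infty}\frac{B_n(x+1) - B_n(x)}{n!}\,z^n = \sum_{n=0}^{\infty}\frac{x^n}{n!}\,z^{n+1},
\]

so by Corollary 8.6.3 we are led to the formula

\[
B_n(x + 1) - B_n(x) = nx^{n-1}, \quad n \ge 1. \tag{8.13}
\]

For $x = 0$, this yields the equality

\[
B_n(0) = B_n(1), \quad n \ge 2. \tag{8.14}
\]

This allows us to consider the periodic Bernoulli functions,

\[
\widetilde{B}_n(x) = B_n(x - \lfloor x\rfloor), \quad\text{for } x \in \mathbb{R}.
\]

$\widetilde{B}_n(x)$ is the extension by periodicity (of period 1) of the restriction of $B_n(x)$ to $[0, 1)$.
Thus

\[
\widetilde{B}_n(x + k) = \widetilde{B}_n(x) \quad\text{for all } x \in \mathbb{R},\ k \in \mathbb{Z}.
\]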

8.7.3 Theorem (Bernoulli's Power Sum Formula) For all positive integers $m$
and $p$,

\[
\sum_{k=1}^{m} k^p = \frac{B_{p+1}(m+1) - B_{p+1}}{p+1}.
\]

Proof Replace $x$ by $k$ and $n$ by $p + 1$ in formula (8.13), and then sum over $k$ from 0
to $m$.

All computer algebra programs (like Maple and Mathematica) have incorporated
this formula, which allows us to evaluate exactly large sums such as

\[
\sum_{k=1}^{1000} k^{10} = 91\,409\,924\,241\,424\,243\,424\,241\,924\,242\,500.
\]
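The power sum formula can be reproduced with exact rational arithmetic. The sketch below is one possible arrangement, not the way any particular computer algebra system works: it generates the Bernoulli numbers from the recurrence $\sum_{k=0}^{n-1}\binom{n}{k}B_k = 0$ ($n \ge 2$), evaluates $B_n(x)$ through formula (8.12), and then applies Theorem 8.7.3. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(n_max):
    """Return [B_0, ..., B_{n_max}] from sum_{k=0}^{n-1} C(n, k) B_k = 0 for n >= 2."""
    B = [Fraction(1)]
    for n in range(2, n_max + 2):
        # Solve the recurrence for B_{n-1}; its coefficient is C(n, n-1) = n.
        partial = sum(comb(n, k) * B[k] for k in range(n - 1))
        B.append(-partial / n)
    return B


def bernoulli_poly(n, x, B):
    """Evaluate B_n(x) = sum_{k=0}^{n} C(n, k) B_k x^(n-k), formula (8.12)."""
    x = Fraction(x)
    return sum(comb(n, k) * B[k] * x ** (n - k) for k in range(n + 1))


def power_sum(m, p):
    """Sum of k^p for k = 1..m, computed as (B_{p+1}(m+1) - B_{p+1}) / (p+1)."""
    B = bernoulli_numbers(p + 1)
    return (bernoulli_poly(p + 1, m + 1, B) - B[p + 1]) / (p + 1)


if __name__ == "__main__":
    print(bernoulli_numbers(6))
    print(power_sum(1000, 10))
```

Because everything is kept as `Fraction`, the result is exact and can be compared directly with a brute-force summation.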
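Exercises

1. Infer from Theorem 8.7.2 that $\frac{|B_{2n}|}{(2n)!} = \frac{2}{(2\pi)^{2n}}\,\zeta(2n) < \frac{4}{(2\pi)^{2n}}$ for $n \ge 1$, and
conclude that the Maclaurin expansion (8.9) is valid for all $x \in (-2\pi, 2\pi)$.

2. Prove that $(-1)^{n-1}B_{2n} > 0$, for all $n > 0$, and thus the nonzero Bernoulli
numbers alternate in sign.

3. Prove the formula $B_n(1 - x) = (-1)^nB_n(x)$, for $n \ge 0$.
[Hint: Use the identity $\frac{ze^{(1-x)z}}{e^z - 1} = \frac{(-z)e^{x(-z)}}{e^{-z} - 1}$.]

4. Infer from the formula

\[
\tan x = \cot x - 2\cot 2x
\]

the power series expansion

\[
\tan x = \sum_{n=1}^{\infty}\frac{2^{2n}(2^{2n} - 1)}{(2n)!}\,|B_{2n}|\,x^{2n-1} \quad\text{for } |x| < \pi/2.
\]

5. Using the relation

\[
\frac{1}{\sin x} = \cot x + \tan\frac{x}{2},
\]

prove the power series expansion

\[
\frac{x}{\sin x} = 1 + \sum_{n=1}^{\infty}\frac{2^{2n} - 2}{(2n)!}\,|B_{2n}|\,x^{2n} \quad\text{for } |x| < \pi.
\]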
8.8 The Weierstrass Approximation Theorems


Though not every function of class $C^\infty$ admits a Taylor series expansion, all continuous
functions can be uniformly approximated (on compact intervals) by polynomials.


8.8.1 Weierstrass Approximation Theorem For each continuous function f : 
[a,b] > R, there is a sequence (Pn) of polynomial functions which is uniformly 
convergent to f on [a, b]. 


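Proof By making a change of variable, $x = (1 - t)a + tb$, we may restrict ourselves to the case of functions defined on $[0, 1]$.
In this case, $f$ is the uniform limit of its Bernstein polynomials,

$$B_n(f; t) = \sum_{k=0}^{n} f\!\left(\frac{k}{n}\right) \binom{n}{k} t^k (1-t)^{n-k}.$$

Indeed, notice the combinatorial identities,

$$\sum_{k=0}^{n} \binom{n}{k} t^k (1-t)^{n-k} = 1,$$

$$\sum_{k=0}^{n} k \binom{n}{k} t^k (1-t)^{n-k} = nt,$$

$$\sum_{k=0}^{n} k^2 \binom{n}{k} t^k (1-t)^{n-k} = nt(1 - t + nt).$$

The last two identities can be deduced from the binomial formula

$$(1+z)^n = \sum_{k=0}^{n} \binom{n}{k} z^k$$

by taking derivatives up to second order and making the substitution $z = t/(1 - t)$.
Let $t \in [0,1]$ and $\delta > 0$. If $A = \{k \in \{0, \dots, n\} : |t - k/n| < \delta\}$, then $|f(t) - f(k/n)| \le \omega_f(\delta)$ for all $k \in A$. For $k \in \{0, \dots, n\} \setminus A$, we have

$$\left| f(t) - f\!\left(\frac{k}{n}\right) \right| \le \left(1 + \frac{|t - k/n|}{\delta}\right) \omega_f(\delta),$$

which follows from the absolute value inequality by adding $\lfloor |t - k/n| / \delta \rfloor$ equidistant intermediate points between $t$ and $k/n$.
Therefore, taking into account the combinatorial identities above, we get

$$\begin{aligned}
|f(t) - B_n(f; t)| &\le \sum_{k=0}^{n} \left| f(t) - f\!\left(\frac{k}{n}\right) \right| \binom{n}{k} t^k (1-t)^{n-k} \\
&\le \omega_f(\delta) \left[ 1 + \frac{1}{\delta} \sum_{k \notin A} \left| t - \frac{k}{n} \right| \binom{n}{k} t^k (1-t)^{n-k} \right] \\
&\le \omega_f(\delta) \left[ 1 + \frac{1}{\delta^2} \sum_{k=0}^{n} \frac{(k - nt)^2}{n^2} \binom{n}{k} t^k (1-t)^{n-k} \right] \\
&= \omega_f(\delta) \left[ 1 + \frac{t(1-t)}{n\delta^2} \right] \le \omega_f(\delta) \left( 1 + \frac{1}{4n\delta^2} \right),
\end{aligned}$$

which easily yields the conclusion: for $\delta = n^{-1/2}$ the bound becomes $\frac{5}{4}\,\omega_f(n^{-1/2})$, which tends to 0 as $n \to \infty$ because $f$ is uniformly continuous.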
By considering separately the real part and the imaginary part of f, one can extend 
Theorem 8.8.1 to the case of complex-valued continuous functions. In that case, the 
approximating polynomials will have complex coefficients. 
The proof of Theorem 8.8.1 also shows that if $f$ is a $C^k$ function, then

$$\frac{d^k}{dt^k} B_n(f; t) \to \frac{d^k f}{dt^k}$$

uniformly on $[0, 1]$.
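The uniform convergence of Bernstein polynomials can be observed numerically. The sketch below evaluates $B_n(f; t)$ directly from its definition and estimates the sup-norm error on a finite grid; the test function $|t - 1/2|$, the grid size, and the function names are illustrative choices, and a grid maximum only approximates the true supremum.

```python
from math import comb


def bernstein(f, n, t):
    """Evaluate the n-th Bernstein polynomial B_n(f; t) for t in [0, 1]."""
    return sum(f(k / n) * comb(n, k) * t**k * (1 - t) ** (n - k)
               for k in range(n + 1))


def sup_error(f, n, grid=200):
    """Approximate sup |f(t) - B_n(f; t)| on a uniform grid of [0, 1]."""
    return max(abs(f(i / grid) - bernstein(f, n, i / grid))
               for i in range(grid + 1))


def kink(t):
    """A continuous function that is not differentiable at t = 1/2."""
    return abs(t - 0.5)


errors = [sup_error(kink, n) for n in (4, 16, 64, 256)]
print(errors)
```

The errors decrease as $n$ grows, but only slowly, which is consistent with the remark on the rate of convergence that follows.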

Unfortunately, the convergence of the sequence of Bernstein polynomials is slow:

8.8.2 Voronovskaya's Theorem If $f$ is twice differentiable on $[0, 1]$, then for all $x \in [0, 1]$,

$$\lim_{n \to \infty} n \left[ f(x) - B_n(f; x) \right] = -\frac{x(1-x)}{2} f''(x).$$

See the paper of Coleman [2] for an elementary proof. For $f \in C^2([0, 1])$, this result is nothing but an easy consequence of Taylor's formula.
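Voronovskaya's limit can be illustrated with exact arithmetic. The sketch below uses the test function $f(x) = x^3$ (an arbitrary choice, with $f''(x) = 6x$) and rational numbers, so no rounding error interferes; the helper names are hypothetical.

```python
from fractions import Fraction
from math import comb


def bernstein_exact(f, n, x):
    """Evaluate B_n(f; x) exactly for a rational x in [0, 1]."""
    return sum(f(Fraction(k, n)) * comb(n, k) * x**k * (1 - x) ** (n - k)
               for k in range(n + 1))


def scaled_error(f, n, x):
    """Return n * (f(x) - B_n(f; x)), the quantity in Voronovskaya's theorem."""
    return n * (f(x) - bernstein_exact(f, n, x))


def cube(x):
    return x**3


def cube_second_derivative(x):
    return 6 * x


x = Fraction(3, 10)
limit = -x * (1 - x) / 2 * cube_second_derivative(x)
gaps = [abs(scaled_error(cube, n, x) - limit) for n in (10, 50, 200)]
print([float(g) for g in gaps])
```

The gaps shrink roughly like $1/n$, so the error $f(x) - B_n(f; x)$ itself is of order $1/n$ wherever $f''(x) \ne 0$, no matter how smooth $f$ is.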

The connection between the rate of approximation by Bernstein polynomials and 
the smoothness of the involved function is discussed in [3]. 

Weierstrass' approximation theorem has a companion for continuous periodic functions of period $2\pi$, which was also proved by Weierstrass. In this case, the role played before by polynomials will be played by trigonometric polynomials of period $2\pi$. A complex-valued trigonometric polynomial is a function of the form

$$T(x) = \sum_{k=-n}^{n} c_k e^{ikx},$$

where the coefficients $c_k$ are complex numbers.


8.8.3 Weierstrass Approximation Theorem (The case of continuous periodic functions) Let $f : \mathbb{R} \to \mathbb{C}$ be a continuous periodic function of period $2\pi$. Then, for any $\varepsilon > 0$, there is a trigonometric polynomial $T_\varepsilon(x)$ such that

$$\sup_{x \in \mathbb{R}} |f(x) - T_\varepsilon(x)| < \varepsilon.$$

In the real-valued case, the polynomial $T_\varepsilon(x)$ can be chosen of the form $T_\varepsilon(x) = a_0 + \sum_{k=1}^{n} (a_k \cos kx + b_k \sin kx)$, where all coefficients $a_k$ and $b_k$ are real numbers.


Proof Consider first the particular case where $f$ is even. According to Theorem 8.8.1, for every $\varepsilon > 0$ there exists a polynomial $P(x)$ such that

$$\sup_{|y| \le 1} |f(\arccos y) - P(y)| < \varepsilon,$$

so, letting $x = \arccos y$, we get

$$\sup_{x \in [0, \pi]} |f(x) - P(\cos x)| < \varepsilon.$$

Since $f$ and $P(\cos x)$ are even, we conclude that

$$\sup_{x \in [-\pi, \pi]} |f(x) - P(\cos x)| < \varepsilon.$$


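Suppose now that $f$ is odd. For $\delta \in (0, \pi/2)$, put

$$g_\delta(x) = \begin{cases} f\!\left( \dfrac{\pi (x - \delta)}{\pi - 2\delta} \right) & \text{if } \delta \le x \le \pi - \delta \\ 0 & \text{if } 0 \le x < \delta \text{ or } \pi - \delta < x \le \pi \end{cases}$$

and extend $g_\delta$ to an odd function on $[-\pi, \pi]$ by using the formula $g_\delta(x) = -g_\delta(-x)$ if $-\pi \le x < 0$. Notice that the function so obtained is continuous. Because $f$ is uniformly continuous, we may choose $\delta > 0$ such that

$$\sup_{x \in [-\pi, \pi]} |f(x) - g_\delta(x)| < \frac{\varepsilon}{2}.$$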
The function

$$h_\delta(x) = \frac{g_\delta(x)}{\sin x} \quad \text{for } x \in (-\pi, \pi) \setminus \{0\} \quad \text{and} \quad h_\delta(0) = h_\delta(-\pi) = h_\delta(\pi) = 0$$

is even and continuous, so by the discussion above, we can choose a trigonometric polynomial $Q$ such that

$$\sup_{x \in [-\pi, \pi]} |h_\delta(x) - Q(x)| < \frac{\varepsilon}{2},$$

whence

$$\sup_{x \in [-\pi, \pi]} |g_\delta(x) - Q(x) \sin x| < \frac{\varepsilon}{2}.$$

Therefore

$$\sup_{x \in [-\pi, \pi]} |f(x) - Q(x) \sin x| \le \sup_{x \in [-\pi, \pi]} |f(x) - g_\delta(x)| + \sup_{x \in [-\pi, \pi]} |g_\delta(x) - Q(x) \sin x| < \varepsilon.$$


In the general case, we use the decomposition of $f$ as the sum of an even function and an odd function,

$$f(x) = \frac{f(x) + f(-x)}{2} + \frac{f(x) - f(-x)}{2}.$$

Some important applications of Theorem 8.8.3 are presented in Sects. 9.8 and 12.1.


Exercises 


1. Prove that if $f$ is increasing on $[0, 1]$, then the Bernstein polynomial $B_n(f)$ is also increasing on $[0, 1]$.

2. Prove that every continuous function $f : [a, b] \to \mathbb{R}$ can be represented as the sum of a uniformly converging series of polynomials.

3. Prove that the exponential function cannot be approximated uniformly on R by 
polynomials. 

4. Infer from the Weierstrass Approximation Theorem that the Banach space 
C([a, b]) is separable (that is, it contains a countable dense subset). 

5. (Approximation via interpolation). Consider a function $f : [a, b] \to \mathbb{R}$ and a family of points $a = x_0 < x_1 < \cdots < x_n = b$, called nodes of interpolation.
(a) Prove that there exists exactly one polynomial of degree at most $n$ that equals $f$ at each of the points $x_i$. This is called the polynomial of interpolation of $f$ at the nodes $x_i$ and has the form

$$P_n(x) = \sum_{i=0}^{n} \left( \prod_{0 \le j \le n,\ j \ne i} \frac{x - x_j}{x_i - x_j} \right) f(x_i).$$

(b) Assuming that $f$ is of class $C^{n+1}$, prove that for every $x \in [a, b]$, there exists $\xi \in (a, b)$ such that

$$f(x) - P_n(x) = \frac{f^{(n+1)}(\xi)}{(n+1)!} \prod_{0 \le i \le n} (x - x_i).$$

Fig. 8.6 A convex function and a concave function


8.9 Convex Functions 


In this section, we discuss a class of real functions with a strong geometric flavor that appears frequently in concrete applications of mathematics.
As above, $I$ will denote a nondegenerate interval.


8.9.1 Definition A function $f : I \to \mathbb{R}$ is called convex if for every $x, y \in I$ and every $\lambda \in [0, 1]$,

$$f((1 - \lambda)x + \lambda y) \le (1 - \lambda) f(x) + \lambda f(y). \tag{8.15}$$


Moreover, if the inequality is strict whenever $x \ne y$ and $\lambda \in (0, 1)$, the function is called strictly convex.

The function $f$ is called concave (strictly concave) if $-f$ is convex (strictly convex).
The functions that are both convex and concave are precisely the affine functions.


All results concerning convex functions have a companion for the concave func- 
tions, obtained by reversing the inequalities. 

Geometrically, the convexity of a function means that no line segment joining two points of its graph lies below the graph at any point. For concavity, no such segment lies above the graph. See Fig. 8.6.

The function $f(x) = x^n$ ($n \in \mathbb{N}$) is strictly convex on $[0, \infty)$ for $n \ge 2$ and affine for $n = 0$ and $n = 1$. The absolute value function is convex but not strictly convex.

The set $\mathrm{Conv}(I)$ of all convex functions on $I$ is closed under addition, multiplication by positive numbers, and taking maximum. So, every function of the form

$$P(x) = ax + b + \sum_{k=1}^{m} c_k |x - a_k| \tag{8.16}$$

where $a, b, a_k \in \mathbb{R}$ and $c_k \ge 0$, for all $k$, is convex. In particular, the positive part and the negative part functions are convex.

More examples of convex/concave functions will be presented below, based on 
differential calculus. 

We can extend the inequality of convexity to the case of convex combinations of points. By definition, a convex combination of points $x_1, x_2, \dots, x_n \in \mathbb{R}$ is any point of the form

$$\sum_{k=1}^{n} \lambda_k x_k$$

where $\lambda_1, \lambda_2, \dots, \lambda_n \in [0, 1]$, and $\sum_{k=1}^{n} \lambda_k = 1$.

The basic remark is that any nonempty interval $I$ is closed under convex combinations. This can be proved by induction (over the cardinality of the family of points). For $n = 2$, this is exactly the property of convexity of $I$. Assume now that the property is true for all convex combinations of $n$ points; we need to show it is true for convex combinations of $n + 1$ points. The nontrivial case is when all the numbers $\lambda_1, \lambda_2, \dots, \lambda_{n+1}$ belong to $(0, 1)$. Then, by our hypothesis,

$$\sum_{k=1}^{n+1} \lambda_k a_k = \sum_{k=1}^{n} \lambda_k a_k + \lambda_{n+1} a_{n+1} = (1 - \lambda_{n+1}) \left( \sum_{k=1}^{n} \frac{\lambda_k}{1 - \lambda_{n+1}}\, a_k \right) + \lambda_{n+1} a_{n+1} \in I.$$


8.9.2 Theorem (Jensen's Inequality) If $f : I \to \mathbb{R}$ is a convex function, then for every $a_1, a_2, \dots, a_n \in I$ and every $\lambda_1, \lambda_2, \dots, \lambda_n \in [0, 1]$ with $\sum_{k=1}^{n} \lambda_k = 1$,

$$f\!\left( \sum_{k=1}^{n} \lambda_k a_k \right) \le \sum_{k=1}^{n} \lambda_k f(a_k).$$

In the case of strictly convex functions, if all the coefficients $\lambda_1, \lambda_2, \dots, \lambda_n$ are nonzero, then the above inequality becomes an equality only if $a_1 = \cdots = a_n$.

A similar result works for the (strictly) concave functions, reversing the inequality sign.


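Proof We will use induction. The case $n = 2$ is exactly the definition of convexity. Assume now that the inequality is true for all convex combinations of $n$ points; we need to show it is true for convex combinations of $n + 1$ points. The nontrivial case is when all the numbers $\lambda_1, \dots, \lambda_{n+1}$ belong to $(0, 1)$. Then, by our hypothesis,

$$\begin{aligned}
f\!\left( \sum_{k=1}^{n+1} \lambda_k a_k \right) &= f\!\left( \sum_{k=1}^{n} \lambda_k a_k + \lambda_{n+1} a_{n+1} \right) \\
&= f\!\left( (1 - \lambda_{n+1}) \sum_{k=1}^{n} \frac{\lambda_k}{1 - \lambda_{n+1}}\, a_k + \lambda_{n+1} a_{n+1} \right) \\
&\le (1 - \lambda_{n+1})\, f\!\left( \sum_{k=1}^{n} \frac{\lambda_k}{1 - \lambda_{n+1}}\, a_k \right) + \lambda_{n+1} f(a_{n+1}) \\
&\le (1 - \lambda_{n+1}) \sum_{k=1}^{n} \frac{\lambda_k}{1 - \lambda_{n+1}}\, f(a_k) + \lambda_{n+1} f(a_{n+1}) \\
&= \sum_{k=1}^{n+1} \lambda_k f(a_k).
\end{aligned}$$

Suppose now that the function $f$ is strictly convex and

$$f\!\left( \sum_{k=1}^{n} \lambda_k a_k \right) = \sum_{k=1}^{n} \lambda_k f(a_k) \tag{8.17}$$

for some points $a_1, \dots, a_n \in I$ and some scalars $\lambda_1, \dots, \lambda_n \in (0, 1)$ that sum up to 1. Suppose $a_1, \dots, a_n$ are not all equal. Then $S = \{k : a_k < \max\{a_1, \dots, a_n\}\}$ is a nonempty proper subset of $\{1, \dots, n\}$ and $\lambda_S = \sum_{k \in S} \lambda_k \in (0, 1)$. Since $f$ is strictly convex, we get

$$\begin{aligned}
f\!\left( \sum_{k=1}^{n} \lambda_k a_k \right) &= f\!\left( \lambda_S \sum_{k \in S} \frac{\lambda_k}{\lambda_S}\, a_k + (1 - \lambda_S) \sum_{k \notin S} \frac{\lambda_k}{1 - \lambda_S}\, a_k \right) \\
&< \lambda_S\, f\!\left( \sum_{k \in S} \frac{\lambda_k}{\lambda_S}\, a_k \right) + (1 - \lambda_S)\, f\!\left( \sum_{k \notin S} \frac{\lambda_k}{1 - \lambda_S}\, a_k \right) \\
&\le \lambda_S \sum_{k \in S} \frac{\lambda_k}{\lambda_S}\, f(a_k) + (1 - \lambda_S) \sum_{k \notin S} \frac{\lambda_k}{1 - \lambda_S}\, f(a_k) = \sum_{k=1}^{n} \lambda_k f(a_k),
\end{aligned}$$

which contradicts our hypothesis (8.17). Therefore, all points $a_k$ should coincide.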
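Jensen's inequality lends itself to a quick numerical sanity check. The sketch below draws random points and weights and computes the gap $\sum_k \lambda_k f(a_k) - f\left(\sum_k \lambda_k a_k\right)$ for the convex function $f = \exp$; the sampling ranges, the seed, and the helper names are arbitrary assumptions, and such an experiment illustrates the theorem without proving anything.

```python
import math
import random


def jensen_gap(f, points, weights):
    """Return sum_k w_k f(a_k) - f(sum_k w_k a_k); nonnegative when f is convex."""
    mean_point = sum(w * a for w, a in zip(weights, points))
    return sum(w * f(a) for w, a in zip(weights, points)) - f(mean_point)


def random_weights(n, rng):
    """Return n strictly positive weights that sum to 1."""
    raw = [rng.random() + 1e-9 for _ in range(n)]
    total = sum(raw)
    return [r / total for r in raw]


rng = random.Random(0)
gaps = []
for _ in range(1000):
    n = rng.randint(2, 6)
    points = [rng.uniform(-3.0, 3.0) for _ in range(n)]
    gaps.append(jensen_gap(math.exp, points, random_weights(n, rng)))

print(min(gaps))
```

Up to rounding error the gap is never negative, and it vanishes when all the points coincide, in agreement with the equality case for strictly convex functions.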


In the case of differentiable functions, convexity means that at every point, the tangent line is a support line of the graph, in the sense that all the points of the graph are on the same side of the tangent line (i.e., in the same half-plane). See Fig. 8.7. The details are given in the following result.


8.9.3 Theorem Let $f : I \to \mathbb{R}$ be a differentiable function. Then:
(a) $f$ is convex if and only if

$$f(x) \ge f(a) + f'(a)(x - a) \quad \text{for all } x, a \in I;$$

(b) $f$ is strictly convex if and only if

$$f(x) > f(a) + f'(a)(x - a) \quad \text{for all } x, a \in I,\ x \ne a.$$


Proof (a) Let us see first that the condition is necessary for a convex function. If $x, a \in I$, with $x \ne a$, and $\lambda \in (0, 1]$, then

$$f(a + \lambda(x - a)) = f((1 - \lambda)a + \lambda x) \le (1 - \lambda) f(a) + \lambda f(x) = f(a) + \lambda (f(x) - f(a)).$$

Fig. 8.7 A support line

Therefore,

$$\frac{f(a + \lambda(x - a)) - f(a)}{\lambda(x - a)}\,(x - a) \le f(x) - f(a).$$

Taking the limit as $\lambda \to 0$, this implies that

$$f'(a)(x - a) \le f(x) - f(a).$$

For the sufficiency, let $x, y \in I$ and $\lambda \in [0, 1]$. Then, by the hypothesis,

$$f(x) \ge f((1 - \lambda)x + \lambda y) + f'((1 - \lambda)x + \lambda y)\,\lambda (x - y)$$

and

$$f(y) \ge f((1 - \lambda)x + \lambda y) - f'((1 - \lambda)x + \lambda y)\,(1 - \lambda)(x - y).$$

The last two inequalities yield

$$(1 - \lambda) f(x) + \lambda f(y) \ge f((1 - \lambda)x + \lambda y)$$

and thus the function $f$ is convex.
(b) To see the necessity, we proceed as in part (a) to get, for $\lambda \in (0, 1)$,

$$\frac{f(a + \lambda(x - a)) - f(a)}{\lambda(x - a)}\,(x - a) < f(x) - f(a).$$

From part (a), we obtain that

$$f(a + \lambda(x - a)) - f(a) \ge f'(a)\,\lambda(x - a).$$

Therefore,

$$f(x) > f(a) + f'(a)(x - a).$$

For the sufficiency part, repeat the above argument in part (a) of this theorem.

Now, we can state the second derivative test of convexity:


8.9.4 Corollary Let $f : I \to \mathbb{R}$ be a twice differentiable function.

(a) If $f'' \ge 0$, then the function $f$ is convex.

(b) If $f'' \ge 0$, and the set of all points where $f''$ vanishes does not include a nondegenerate interval, then the function $f$ is strictly convex.

Similar results work for the (strictly) concave functions, by changing the inequality signs.


Proof (a) If $f'' \ge 0$, then $f'$ is increasing. By the Mean Value Theorem, for every $x, a \in I$, there is a point $c$ between $x$ and $a$ such that

$$f(x) = f(a) + (x - a) f'(c),$$

hence $f(x) \ge f(a) + (x - a) f'(a)$. The conclusion follows from Theorem 8.9.3 (a).
The proof of (b) is similar.


Some easy consequences of the second derivative test of convexity are as follows:

the function $e^x$ is strictly convex on $\mathbb{R}$;

the function $x^\alpha$, with $\alpha > 1$, is strictly convex on $(0, \infty)$;

the function $\tan x$ is strictly convex on $[0, \pi/2)$;

the function $\log x$ is strictly concave on $(0, \infty)$;

the function $\sin x$ is strictly concave on $[0, \pi]$;

the function $\cos x$ is strictly concave on $[-\pi/2, \pi/2]$.


Most of the usual functions admit a partition of the interval of definition such that on each partial interval they are either convex or concave. Examples are the functions sin and cos; use the second derivative test of convexity. The points where a continuous function changes from convex to concave, or vice versa, are called points of inflection.

For the logarithmic function, Jensen's inequality yields the following important application:


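8.9.5 Theorem The Weighted Arithmetic Mean-Geometric Mean Inequality (L.J. Rogers) If $x_1, \dots, x_n \in (0, \infty)$ and $\alpha_1, \dots, \alpha_n \in (0, 1)$, $\sum_{k=1}^{n} \alpha_k = 1$, then

$$\sum_{k=1}^{n} \alpha_k x_k \ge x_1^{\alpha_1} \cdots x_n^{\alpha_n},$$

with equality only when $x_1 = \cdots = x_n$.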


This inequality expresses the fact that the weighted arithmetic mean of a set of positive numbers is greater than or equal to their weighted geometric mean.
Replacing each $x_k$ by $1/x_k$ in the above inequality, we obtain that

$$x_1^{\alpha_1} x_2^{\alpha_2} \cdots x_n^{\alpha_n} \ge \frac{1}{\sum_{k=1}^{n} \frac{\alpha_k}{x_k}},$$

the inequality being strict unless $x_1 = x_2 = \cdots = x_n$. This expresses the fact that the weighted geometric mean of a set of positive numbers is greater than or equal to their weighted harmonic mean.

If we choose equal weights $\alpha_1 = \alpha_2 = \cdots = \alpha_n = \frac{1}{n}$, we get the usual Arithmetic Mean-Geometric Mean-Harmonic Mean Inequality,

$$\frac{x_1 + x_2 + \cdots + x_n}{n} \ge \sqrt[n]{x_1 x_2 \cdots x_n} \ge \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \cdots + \frac{1}{x_n}}.$$

The two inequalities are strict except in the case where $x_1 = x_2 = \cdots = x_n$.
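The chain of inequalities between the three weighted means is easy to evaluate numerically. The following sketch computes them for given positive numbers and weights; the function name and the sample data are illustrative only.

```python
import math


def weighted_means(xs, alphas):
    """Return the weighted arithmetic, geometric and harmonic means.

    xs must be positive numbers; alphas must be positive weights summing to 1.
    """
    arithmetic = sum(a * x for a, x in zip(alphas, xs))
    # The geometric mean is computed through logarithms, mirroring the
    # derivation from Jensen's inequality for the concave function log.
    geometric = math.exp(sum(a * math.log(x) for a, x in zip(alphas, xs)))
    harmonic = 1.0 / sum(a / x for a, x in zip(alphas, xs))
    return arithmetic, geometric, harmonic


A, G, H = weighted_means([1.0, 4.0, 16.0], [0.5, 0.25, 0.25])
print(A, G, H)
```

For distinct numbers the three values are strictly ordered, $A > G > H$; when all the numbers coincide, the three means agree up to rounding.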


What are the differential properties of a convex function? Let us notice first that a convex function may fail to be continuous and that, even if it is continuous, it may fail to be differentiable. The function

$$f(x) = \begin{cases} 1 & \text{if } x = 0 \text{ or } 1 \\ 0 & \text{if } x \in (0, 1) \end{cases}$$

is an example of a noncontinuous convex function. The function $f(x) = |x|$ is an example of a nondifferentiable continuous convex function. We will see next that these possibilities are very limited, though. Namely, the convex functions are continuous at every interior point of the interval and they have finite one-sided derivatives at these points. As a consequence of this, we will get that the set of points where a convex function is not differentiable is countable. The proof is based on the monotonicity of the slopes of the lines passing through a fixed point of the graph.


8.9.6 Lemma (L. Galvani) If $f : I \to \mathbb{R}$ is a convex function, then for all points $x < y < z$ in $I$, we have

$$\frac{f(y) - f(x)}{y - x} \le \frac{f(z) - f(x)}{z - x} \le \frac{f(z) - f(y)}{z - y}.$$


Proof Since $y$ is between $x$ and $z$, it is a convex combination of them, that is,

$$y = (1 - \lambda)x + \lambda z$$

for some $\lambda \in (0, 1)$. Replacing $y$ in the inequalities above and doing a little algebra, we get the inequality defining the convexity of $f$.


8.9.7 Theorem (O. Stolz) If $f : I \to \mathbb{R}$ is a convex function, then $f$ has finite left and right derivatives at every interior point $a$ of $I$ (in particular, $f$ is continuous at $a$). Moreover, if $x < y$ are interior points of $I$ then

$$f'_-(x) \le f'_+(x) \le f'_-(y) \le f'_+(y).$$


Proof Let $x_1 < x_2 < a < u$ in $I$. Then, by Lemma 8.9.6,

$$\frac{f(x_1) - f(a)}{x_1 - a} \le \frac{f(x_2) - f(a)}{x_2 - a} \le \frac{f(u) - f(a)}{u - a}. \tag{8.18}$$

Thus, if $(x_n)_n$ is a sequence increasing to $a$,

$$\lim_{n \to \infty} \frac{f(x_n) - f(a)}{x_n - a}$$

exists and does not exceed

$$\frac{f(u) - f(a)}{u - a}.$$

This implies that $f'_-(a)$ exists, is finite, and does not exceed $\frac{f(u) - f(a)}{u - a}$. A similar argument shows that $f'_+(a)$ exists, is finite, and $f'_-(a) \le f'_+(a)$. The continuity of $f$ at $a$ is now straightforward. Using the inequalities (8.18), we can prove that $f'_+(x) \le f'_-(y)$, for $x < y$.


As pointed out above, at the endpoints of the interval of definition, the function may not be continuous. Nevertheless, the discontinuities can be only of the first kind. See Exercise 2 below.


8.9.8 Corollary The set $E_f$ of all points where a convex function $f$ fails to be differentiable is countable.


Proof If $x \in E_f$, then the open interval $I_x = (f'_-(x), f'_+(x))$ is nonempty. By Theorem 8.9.7, if $x \ne y$ then $I_x \cap I_y = \emptyset$. Thus, choosing for each $x \in E_f$ a rational number $r_x \in I_x$, the map taking $x \in E_f$ into $r_x \in \mathbb{Q}$ is injective.


By Theorem 8.9.7 and Exercise 3, Sect. 6.4, every differentiable convex function is actually a function of class $C^1$.

A stronger property than convexity is log-convexity. A function $f : I \to (0, \infty)$ is said to be log-convex if

$$x, y \in I \text{ and } \lambda \in [0, 1] \text{ implies } f((1 - \lambda)x + \lambda y) \le f(x)^{1-\lambda} f(y)^{\lambda},$$

that is, if $\log f$ is convex. If a function $f$ is log-convex, then it is also convex. In fact, by Theorem 8.9.5,

$$f((1 - \lambda)x + \lambda y) \le f(x)^{1-\lambda} f(y)^{\lambda} \le (1 - \lambda) f(x) + \lambda f(y).$$

In a natural manner, we can introduce the concept of a log-concave function. Notice that $e^x - 1$ is convex and log-concave.


Exercises 


1. Show that the following functions are strictly convex:
(a) $-\log x$ and $x \log x$ on $(0, \infty)$.
(b) $x^p$ on $[0, \infty)$ if $p > 1$.
(c) $-x^p$ on $[0, \infty)$ if $p \in (0, 1)$.
(d) $(1 + x^p)^{1/p}$ on $[0, \infty)$ if $p > 1$.

2. If $f : I \to \mathbb{R}$ is a convex function, then show that either $f$ is monotone or there is a point $c \in I$ such that $f$ is decreasing on $(-\infty, c] \cap I$ and increasing on $[c, \infty) \cap I$. Using this property, show that a convex function $g : [a, b] \to \mathbb{R}$ has finite limits at the endpoints of the interval $[a, b]$.

3. Prove that the graph of a strictly convex (respectively of a strictly concave) continuous function cannot contain three collinear points.

4. Let $I$ be an open interval. Prove that a function $f : I \to \mathbb{R}$ is convex if and only if for every $a \in I$, there is a real number $\lambda$ such that

$$f(x) \ge f(a) + \lambda(x - a)$$

for all $x \in I$. In other words, the convex functions on $I$ are exactly the functions having a support line at each point.

[Hint: For sufficiency, proceed as in the proof of Theorem 8.9.3. For necessity, show that the inequality holds for all $\lambda \in [f'_-(a), f'_+(a)]$.]

Note Under the assumptions of Exercise 4, we can associate to a convex function $f$ its subdifferential,

$$\partial f = \{\varphi : I \to \mathbb{R} : \varphi(x) \in [f'_-(x), f'_+(x)] \text{ for every } x \in I\},$$

which provides a valuable substitute for the differential.


5. (The Variational Theorem for convex functions). Let $f$ be a continuous convex function defined on a nondegenerate interval $I$. Prove that

$$f(x) = \sup \{f(a) + f'_+(a)(x - a) : a \in \operatorname{int} I\} = \sup \{f(a) + \lambda(x - a) : a \in \operatorname{int} I,\ \lambda \in \partial f(a)\}$$

for all $x \in I$.

Note By the Variational Theorem, every continuous convex function is an upper envelope of affine functions. Notice that the converse is trivial. Precisely, if $f_\alpha : I \to \mathbb{R}$ ($\alpha \in A$) is a family of convex functions bounded above by a function $g : I \to \mathbb{R}$, then the upper envelope,

$$f(x) = \sup_{\alpha} f_\alpha(x),$$

is also a convex function.


6. (Convex Mean Value Theorem). Consider a continuous convex function $f : [a, b] \to \mathbb{R}$. Prove that

$$\frac{f(b) - f(a)}{b - a} \in \partial f(c)$$

for some point $c \in (a, b)$.


7. Let $f$ be a function defined on an open interval. The symmetric upper derivative of order two of $f$ at $x$ is defined by

$$\overline{D}^2 f(x) = \limsup_{h \to 0+} \frac{f(x + h) + f(x - h) - 2f(x)}{h^2}.$$

(a) Prove that if $f$ is twice differentiable at $x$, then $f''(x) = \overline{D}^2 f(x)$.
(b) Show that $\overline{D}^2 f(x)$ can exist even if the function is only continuous at $x$.

8. (The characterization of convex functions). Let $f$ be a function defined on an open interval $I$. Prove that $f$ is convex if and only if $f$ is continuous and $\overline{D}^2 f(x) \ge 0$ on $I$.
[Hint: If $\overline{D}^2 f(x) > 0$ on $I$, prove by contradiction. In the general case, consider the sequence $f_n(x) = f(x) + \frac{x^2}{n}$.]


9. (The Hardy-Littlewood-Pólya Inequality, as generalized by L. Fuchs). Let $f : [a, b] \to \mathbb{R}$ be a convex function and consider points $x_1, \dots, x_n, y_1, \dots, y_n$ in $[a, b]$ and real numbers $p_1, \dots, p_n$ such that

(a) $x_1 \ge \cdots \ge x_n$, $y_1 \ge \cdots \ge y_n$;

(b) $\sum_{k=1}^{r} p_k x_k \le \sum_{k=1}^{r} p_k y_k$ for every $r = 1, \dots, n - 1$;

(c) $\sum_{k=1}^{n} p_k x_k = \sum_{k=1}^{n} p_k y_k$.

Prove that

$$\sum_{k=1}^{n} p_k f(x_k) \le \sum_{k=1}^{n} p_k f(y_k).$$

[Hint: Notice that we may restrict to the case when all points $x_k$ and $y_k$ are interior to $[a, b]$. Then, use the subdifferential and Abel's partial summation formula.]

10. (Popoviciu's Inequality). Suppose that $f : I \to \mathbb{R}$ is a convex function. Show that for all $x, y, z \in I$,

$$\frac{f(x) + f(y) + f(z)}{3} + f\!\left( \frac{x + y + z}{3} \right) \ge \frac{2}{3} \left[ f\!\left( \frac{x + y}{2} \right) + f\!\left( \frac{y + z}{2} \right) + f\!\left( \frac{z + x}{2} \right) \right].$$

What does this inequality mean in the case of the exponential function?
[Hint: Clearly, we may assume that $x \ge y \ge z$. Then $(x + y)/2 \ge (z + x)/2 \ge (y + z)/2$ and $x \ge (x + y + z)/3 \ge z$. If $x \ge (x + y + z)/3 \ge y \ge z$, apply the Hardy-Littlewood-Pólya Inequality to the families

$$x_1 = x, \quad x_2 = x_3 = x_4 = (x + y + z)/3, \quad x_5 = y, \quad x_6 = z,$$
$$y_1 = y_2 = (x + y)/2, \quad y_3 = y_4 = (x + z)/2, \quad y_5 = y_6 = (y + z)/2.$$

The case when $x \ge y \ge (x + y + z)/3 \ge z$ is similar.]

11. Suppose that $f, g : [0, \infty) \to \mathbb{R}$ are differentiable convex functions such that both $f$ and $1/g$ are convex. If $\lim_{x \to \infty} \frac{f(x)}{g(x)} = \ell$, prove that

$$\lim_{x \to \infty} \frac{f'(x)}{g'(x)} = \ell.$$

12. Use the concavity of the function $\log(\sin x / x)$ to prove that in any triangle $ABC$,

$$\sin A \sin B \sin C \le \left( \frac{3\sqrt{3}}{2\pi} \right)^{3} ABC.$$


8.10 The Extrema of Convex Functions 


Continuous convex functions have very nice extremal properties. They attain their maximum at the boundary, and every local minimum is a global one. Moreover, a strictly convex function admits at most one point of minimum.

Fermat's Theorem provides a necessary condition for an extremum. By Theorem 8.9.3, the points where the derivative of a convex function vanishes are points of global minimum. This remark is the basis of a very useful criterion for the local extrema of a function of class $C^2$.


8.10.1 Theorem If $f \in C^2(I, \mathbb{R})$, then every point $a \in I$ such that $f'(a) = 0$ and $f''(a) > 0$ is a strict local minimum.


Proof By Taylor's formula (as stated in Remark 8.4.2),

$$f(x) = f(a) + f'(a)(x - a) + \frac{f''(a) + \omega(x)}{2}(x - a)^2 = f(a) + \frac{f''(a) + \omega(x)}{2}(x - a)^2,$$

where $\lim_{x \to a} \omega(x) = \omega(a) = 0$. Thus, there is $r > 0$ such that $|\omega(x)| < f''(a)/2$, for all $x \in U = I \cap (a - r, a + r)$. Clearly, $U$ is a neighborhood of $a$ (in the relative topology of $I$) and

$$f(x) - f(a) \ge \frac{f''(a)}{4}(x - a)^2 > 0$$

for all $x \in U$, $x \ne a$, hence $a$ is a strict local minimum.


Since the derivatives have the intermediate value property, the realm of Theorem 8.10.1 is that of strictly convex functions of class $C^2$.

8.10.2 Theorem Let $f : \mathbb{R} \to \mathbb{R}$ be a continuous function such that

$$\lim_{x \to -\infty} f(x) = \lim_{x \to \infty} f(x) = \infty.$$

Then $f$ has a global minimum.
If in addition $f$ is strictly convex and differentiable, then the point where $f$ attains its global minimum is the unique solution of the equation

$$f'(x) = 0.$$


Proof Let $m = \inf_{x \in \mathbb{R}} f(x)$. Then $m \in [-\infty, \infty)$, and there is a sequence $(x_n)_n$ such that $f(x_n) \to m$. Now consider an arbitrary subsequence $(y_n)_n$ of $(x_n)_n$ that has a limit, say $a$. The existence of such subsequences is guaranteed by the Bolzano-Weierstrass Theorem. See Sect. 2.5. Since $\lim_{x \to -\infty} f(x) = \lim_{x \to \infty} f(x) = \infty$, necessarily $a$ is different from $-\infty$ and $\infty$, hence $a \in \mathbb{R}$. Then

$$m = \lim_{n \to \infty} f(y_n) = f(a),$$

and thus, $a$ is a point of global minimum.
The second assertion in the statement follows from Theorems 8.3.1 and 8.9.3(b).


An algorithm for localizing the point of global minimum of a strictly convex 
function is suggested in Exercise 4. 
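To give an idea of how function values alone can localize the minimum of a strictly convex function, here is a minimal sketch of ternary search. This is a standard scheme chosen for illustration and is not claimed to be the procedure of Exercise 4; the tolerance and the sample function are assumptions.

```python
import math


def ternary_search_min(f, a, b, tol=1e-9):
    """Localize the minimum point of a strictly convex f on [a, b].

    Each step compares f at two interior points and discards the third of
    the interval that cannot contain the minimum point.
    """
    lo, hi = a, b
    while hi - lo > tol:
        third = (hi - lo) / 3
        left, right = lo + third, hi - third
        if f(left) < f(right):
            # f would be decreasing on [lo, right] if the minimum point were
            # at or beyond 'right', so it must lie in [lo, right].
            hi = right
        else:
            # Symmetrically, the minimum point lies in [left, hi].
            lo = left
    return (lo + hi) / 2


# Example: exp(x) - 2x is strictly convex with minimum at x = log 2.
approx = ternary_search_min(lambda x: math.exp(x) - 2 * x, -2.0, 3.0)
print(approx, math.log(2))
```

Each iteration shrinks the interval by the factor 2/3 using two function evaluations. In floating point the comparisons become unreliable once the interval is very short, so the attainable accuracy is limited to roughly the square root of the machine precision.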


Exercises 


1. (a) Prove that every local minimum of a convex function is a global one.
(b) Prove that a convex function $f$ defined on an open interval $I$ attains a global minimum at $a$ if $f'_-(a) \le 0$ and $f'_+(a) \ge 0$.
2. Find the extrema of the function

$$f(x) = \left( 1 + \frac{x}{1!} + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!} \right) e^{-x}, \quad x \in [0, \infty).$$

3. Prove that the minimum of the function

$$f(a, b, c) = \frac{a}{\sqrt{a^2 + 8bc}} + \frac{b}{\sqrt{b^2 + 8ca}} + \frac{c}{\sqrt{c^2 + 8ab}}$$

for $a, b, c > 0$ is 1.
4. Let f : [a,b] — R be a strictly convex function and let t € (0,1/2) bea 
parameter. Prove that the knowledge of the sequence 
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2 2 


fa), f (PERO) (I 


).10 


allows us to localize the point of global minimum of f in a subinterval of [a, b] 
whose length does not exceed (4+) (b— a). 

5. Suppose that $f : [a, b] \to \mathbb{R}$ is a convex function, $n$ is a positive integer and $S \in [a, b]$. Prove that

$$\sup \{ f(x_1) + f(x_2) + \cdots + f(x_n) : x_1, x_2, \dots, x_n \in [a, b],\ x_1 + x_2 + \cdots + x_n = nS \}$$

is attained at an $n$-tuple $\{x_1, x_2, \dots, x_n\}$ having $n - 1$ components equal (to $a$ or to $b$).


8.11 Nowhere Differentiable Continuous Functions 


How far from differentiability can a continuous function be? As far as possible.

Bolzano was the first mathematician to discover the existence of nowhere differentiable continuous functions (so to speak, functions having sharp points everywhere). His example starts with the tent map

$$\varphi_0(x) = \begin{cases} x & \text{if } 0 \le x \le 1/2 \\ 1 - x & \text{if } 1/2 < x \le 1 \end{cases}$$

which is extended by periodicity to the whole real line. Thus, we obtain a continuous and periodic function $\varphi_0$, of period 1. For $n \ge 0$, the functions

$$\varphi_n(x) = \frac{1}{4^n}\, \varphi_0(4^n x)$$

are continuous and periodic, of period $4^{-n}$. They are differentiable except for the points $p/(2 \cdot 4^n)$, where $p$ is an integer and $n$ is a natural number.

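Let $\varphi(x) = \sum_{n=0}^{\infty} \varphi_n(x)$ for $x \in \mathbb{R}$. Using the Weierstrass M-test (Theorem 6.8.6 above) we deduce that the function $\varphi$ is continuous. Nevertheless, the function $\varphi$ is nowhere differentiable. To see this, let $x_0 \in \mathbb{R}$, $m \in \mathbb{N}^*$ and put $h_m = \pm 4^{-m}$. Then, for $n \ge m$,

$$\frac{\varphi_n(x_0 + h_m) - \varphi_n(x_0)}{h_m} = 0.$$

The function $\varphi_{m-1}$ is differentiable on intervals of length $2/4^m$, and the interval of length $2/4^m$ containing $x_0$ also contains one of the intervals

$$\left( x_0 - \frac{1}{4^m},\, x_0 \right) \quad \text{or} \quad \left( x_0,\, x_0 + \frac{1}{4^m} \right).$$

We choose the sign of $h_m$ so that $x_0 + h_m$ is the other endpoint of such an interval. On such an interval the functions $\varphi_0, \dots, \varphi_{m-1}$ are differentiable and their derivative is either $-1$ or $1$. Therefore,

$$\frac{\varphi(x_0 + h_m) - \varphi(x_0)}{h_m} = \sum_{k=0}^{m-1} \frac{\varphi_k(x_0 + h_m) - \varphi_k(x_0)}{h_m} = \begin{cases} \text{an even number} & \text{if } m \text{ is even} \\ \text{an odd number} & \text{if } m \text{ is odd.} \end{cases}$$

It is easy to see now that the limit

$$\lim_{m \to \infty} \frac{\varphi(x_0 + h_m) - \varphi(x_0)}{h_m}$$

does not exist.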
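The parity argument can be reproduced with exact rational arithmetic. The sketch below evaluates the difference quotients of $\varphi$ at a rational point, choosing the sign of $h_m$ so that $x_0$ and $x_0 + h_m$ lie in a common interval of linearity of $\varphi_{m-1}$. The truncation length, the sample point $x_0 = 1/3$, and the function names are assumptions made for the illustration; the truncation is harmless because the terms with index $n \ge m$ cancel exactly.

```python
from fractions import Fraction
from math import floor


def tent(x):
    """phi_0: the distance from x to the nearest integer (period 1)."""
    frac = x - floor(x)
    return min(frac, 1 - frac)


def phi_partial(x, terms):
    """Partial sum of phi(x) = sum_n phi_0(4^n x) / 4^n."""
    return sum(tent(4**n * x) / 4**n for n in range(terms))


def difference_quotient(x0, m, terms=None):
    """Return (phi(x0 + h_m) - phi(x0)) / h_m with h_m = +-4^(-m)."""
    terms = m + 5 if terms is None else terms
    # phi_{m-1} is linear between consecutive multiples of 1 / (2 * 4^(m-1)).
    scale = 2 * 4 ** (m - 1)
    u = scale * x0
    # Step to the right if x0 sits in the left half of its cell, else left.
    sign = 1 if u - floor(u) <= Fraction(1, 2) else -1
    h = Fraction(sign, 4**m)
    return (phi_partial(x0 + h, terms) - phi_partial(x0, terms)) / h


x0 = Fraction(1, 3)
quotients = [difference_quotient(x0, m) for m in range(1, 9)]
print(quotients)
```

Each quotient is an integer with the same parity as $m$, so consecutive quotients differ by at least 1 and the sequence cannot converge.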
The above example is not an accident. In a certain sense, most continuous functions 
are nowhere differentiable. This was noted by S. Banach in 1931. 


8.11.1 Theorem (S. Banach) The set $D$ of all differentiable functions in $C([0, 1], \mathbb{R})$ is a set of first category.


Proof For each integer $n \ge 1$, consider the set

$$M_n = \left\{ f \in C([0, 1], \mathbb{R}) : \sup_{0 < t \le 1/n} \left| \frac{f(x + t) - f(x)}{t} \right| \le n \text{ for some } x \in \left[ 0, 1 - \frac{1}{n} \right] \right\}.$$

By using the Bolzano-Weierstrass Theorem, it is easy to show that all the sets $M_n$ are closed. They are also nowhere dense.

Indeed, assume the contrary. Then there is an integer $n \ge 1$, a function $f \in M_n$ and a number $\varepsilon > 0$ such that

$$g \in C([0, 1], \mathbb{R}),\ \|g - f\|_\infty < \varepsilon \text{ implies } g \in M_n.$$

By the Weierstrass Approximation Theorem, there is a polynomial $P_\varepsilon$ such that $\|f - P_\varepsilon\|_\infty < \varepsilon$. Let $\delta = \varepsilon - \|f - P_\varepsilon\|_\infty$. Then

$$g \in C([0, 1], \mathbb{R}),\ \|g - P_\varepsilon\|_\infty < \delta \text{ implies } g \in M_n. \tag{8.19}$$

Next, we consider a piecewise linear function $h : [0, 1] \to \mathbb{R}$ such that $\|h\|_\infty < \delta$ and $|h'_+(x)| > n + \|P'_\varepsilon\|_\infty$ for all $x \in [0, 1)$.
By (8.19), we have $h + P_\varepsilon \in M_n$. On the other hand,

$$|(h + P_\varepsilon)'_+(x)| = |h'_+(x) + P'_\varepsilon(x)| \ge |h'_+(x)| - |P'_\varepsilon(x)| > n$$

at all points of $[0, 1)$. Thus, $h + P_\varepsilon \notin M_n$. This contradiction proves that $\operatorname{int} M_n$ is empty for all $n$. Consequently, $D \subset \bigcup_n M_n$ is a set of first category. □


The argument of Theorem 8.11.1 may yield even more striking results. See Exercise 8, Sect. B1.


Exercises 


1. Prove that Bolzano’s example of a nowhere differentiable continuous function 
is not monotonic in any nondegenerate interval. 

2. Let $I$ be an open interval and let $f$ be a real-valued function defined on $I$. Prove that there exist only countably many points $x \in I$ such that $f'_-(x)$ and $f'_+(x)$ both exist (possibly in $\overline{\mathbb{R}}$) and they are not equal.

3. (K. Weierstrass). Let $a$ be an odd positive integer and $b$ a real number such that $0 < b < 1$ and $ab > 1 + 3\pi/2$. Prove that the function

$$f(x) = \sum_{n=0}^{\infty} b^n \cos(a^n \pi x)$$

is continuous on $\mathbb{R}$ and nowhere differentiable.
4. (Hardy [4]). Prove that the function

$$f(x) = \sum_{n=1}^{\infty} \frac{\sin(n^2 \pi x)}{n^2}$$

(though continuous on $\mathbb{R}$) is not differentiable at any irrational value of $x$.
Note Gerver [5] proved that the only points of differentiability of this function are those of the form $x = \frac{2k + 1}{2m + 1}$, for $k, m \in \mathbb{Z}$.


8.12 Notes and Remarks 


Many mathematicians contributed to the invention of Calculus long before Gottfried Wilhelm von Leibniz and Isaac Newton. Among them was Pierre de Fermat, who imagined a tangent as a secant line that meets the curve twice at the same point. If the curve is the parabola $y = x^2$ and the point is $P$ of coordinates $(a, a^2)$, then a secant line through $P$ has the equation $y - a^2 = m(x - a)$. The value of the slope $m$ for which the intersection points coincide is precisely $m = 2a$. As $a$ was arbitrarily chosen, the above reasoning yields a rule relating the given function $y = x^2$ to a new one, $m = 2x$, describing the slopes of the tangent lines.

The notation $\frac{dy}{dx}$ for the derivative, as well as the chain rule and the rule for differentiating a product of functions, are due to Leibniz.

The first textbook on differential calculus was published by the French mathematician Guillaume de l'Hôpital in 1696, under the title Analyse des Infiniment Petits pour l'Intelligence des Lignes Courbes (literal translation: Analysis of the Infinitely Small for the Understanding of Curved Lines). That book includes the famous rules on indeterminate forms, which he learned from Johann Bernoulli. Euler's influential book Introductio in analysin infinitorum (Introduction to the Analysis of the Infinite) appeared in 1748. However, Augustin-Louis Cauchy was the first to stress the importance of rigor in analysis. His book Cours d'analyse de l'École royale polytechnique (1821) had a great impact in its time, but the first to introduce the modern concepts of limit, continuity, and differentiability were Bernhard Bolzano (1817) and Karl Theodor Wilhelm Weierstrass (1859/1860).

The following formula for the $n$th derivative of a composition of functions was first noticed by Louis François Antoine Arbogast in 1800 and rediscovered later by many other mathematicians.


8.12.1 Faà di Bruno's Formula Suppose that $f$ and $g$ are two $n$-times differentiable functions and the composition $g \circ f$ makes sense. Then

$$(g \circ f)^{(n)}(x) = \sum \frac{n!}{m_1!\, m_2! \cdots m_n!}\; g^{(m_1 + \cdots + m_n)}(f(x)) \prod_{j=1}^{n} \left( \frac{f^{(j)}(x)}{j!} \right)^{m_j},$$

where the sum is over all $n$-tuples $(m_1, \dots, m_n)$ of nonnegative integers satisfying the constraint

$$1 \cdot m_1 + 2 \cdot m_2 + \cdots + n \cdot m_n = n.$$

An $n$-tuple $(m_1, \dots, m_n)$ as above is called an integer partition of $n$. They were introduced in mathematics by Euler and play an important role in number theory and combinatorics.
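The index set and the numerical coefficients in Faà di Bruno's formula can be enumerated by a short program. The sketch below lists the tuples $(m_1, \dots, m_n)$ and computes the coefficient $n! / \prod_j m_j!\,(j!)^{m_j}$ attached to each of them. As a consistency check it uses a known combinatorial fact that is not stated in the text above: these coefficients sum to the $n$th Bell number. The function names are illustrative.

```python
from math import factorial


def partition_tuples(n):
    """Yield all (m_1, ..., m_n) of nonnegative integers with sum_j j*m_j = n."""
    def extend(j, remaining, prefix):
        if j > n:
            if remaining == 0:
                yield tuple(prefix)
            return
        for m in range(remaining // j + 1):
            yield from extend(j + 1, remaining - j * m, prefix + [m])
    yield from extend(1, n, [])


def faa_di_bruno_coefficient(ms):
    """Return n! / prod_j (m_j! * (j!)^m_j) for the tuple ms = (m_1, ..., m_n)."""
    n = sum(j * m for j, m in enumerate(ms, start=1))
    denominator = 1
    for j, m in enumerate(ms, start=1):
        denominator *= factorial(m) * factorial(j) ** m
    return factorial(n) // denominator


for ms in partition_tuples(4):
    print(ms, faa_di_bruno_coefficient(ms))
```

For $n = 3$ the tuples are $(3,0,0)$, $(1,1,0)$ and $(0,0,1)$ with coefficients 1, 3 and 1, which matches the familiar expansion $(g \circ f)''' = g'''(f)\,(f')^3 + 3\,g''(f)\,f' f'' + g'(f)\,f'''$.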

Theorem 8.3.2, which asserts that the derivative of every differentiable function has the intermediate value property, was proved by Jean Gaston Darboux in his Mémoire sur les fonctions discontinues, Ann. Sci. École Norm. Sup., 4 (1875), 57-112. There, he noticed for the first time the existence of discontinuous functions having the intermediate value property.

The Bernoulli-L'Hôpital Rules were the subject of many investigations. In [6], one can find the following result:


8.12.2 Theorem Suppose that $f, g : (a, b) \to \mathbb{R}$ are two differentiable functions such that $g'(x) \ne 0$ for every $x$. If the limit $\lim_{x \to a} \frac{f'(x)}{g'(x)}$ exists, then the limit $\lim_{x \to a} \frac{f(x)}{g(x)}$ exists too.


Boas [7] showed how to deduce the Bernoulli-L'Hôpital Rules via the generalized Riemann integral and extended them for other operators of derivation. Ostrowski [8] discussed a number of generalizations that allow $g'$ to vanish on null sets.


8.12.3 The Monotone Form of Bernoulli-L'Hôpital Rule (Mikhail Gromov; see [9], Lemma III.4.1, p. 134) Let $f, g : [a, b] \to \mathbb{R}$ be two continuous functions which are differentiable on $(a, b)$. Further, let $g' \ne 0$ on $(a, b)$. If $f'/g'$ is strictly increasing (decreasing) on $(a, b)$, then the functions

$$x \mapsto \frac{f(x) - f(a)}{g(x) - g(a)} \quad \text{and} \quad x \mapsto \frac{f(x) - f(b)}{g(x) - g(b)}$$

are also strictly increasing (decreasing) on $(a, b)$.


For applications, see the paper of Anderson et al. [10] (and the references therein). 

The Taylor series were introduced by Brook Taylor in 1715, but (typically for 
his times) he neglected the convergence aspects. Taylor’s formula with remainder 
(as stated in Theorem 8.4.1) was proved in 1772 by Joseph-Louis Lagrange, who 
considered it as the main foundation of differential calculus. The analysis of the 
convergence of the Newton—Raphson iterative method is also due to Lagrange. 

Bernoulli polynomials and Bernoulli numbers are named after the Swiss mathe- 
matician Jacob Bernoulli, who introduced them in his book Ars Conjectandi, pub- 
lished posthumously in 1713 at Basel. These concepts play an important role in many 
diverse areas of mathematics, for example in number theory, in combinatorics and 
in analysis (see the Euler-Maclaurin summation formula in Sect. 10.2). 

Bernoulli’s inequality (see Sect. 8.2, Exercise 1) is also due to Jacob Bernoulli. 

The history of the whole contribution of Leonhard Euler to Riemann’s zeta func- 
tion can be found in the paper of Ayoub [11]. 

The elegant proof of Theorem 8.8.1 (The Weierstrass Approximation Theorem) 
is due to Bernstein [12], who used a probabilistic argument for the combinatorial 
identities on which the proof is based. These identities show that $B_n(f; t) \to f$ 
uniformly on $[0, 1]$, for $f \in \{1, x, x^2\}$. Taking into account that $B_n(f; t) \ge 0$ if 
$f \ge 0$, and that for every $f \in C([0, 1], \mathbb{R})$ and every $\varepsilon > 0$ there exists a positive 
number $C$ such that 

$$|f(x) - f(y)| \le \varepsilon + C (x - y)^2$$

whenever $x, y \in [0, 1]$, one can show easily that $B_n(f; t) \to f$ uniformly on $[0, 1]$, 
for every $f \in C([0, 1], \mathbb{R})$. This alternative proof of the Weierstrass Approximation 
Theorem is due to Pavel Petrovich Korovkin. See the paper of Niculescu [13], for 
details and generalizations. 
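
The uniform convergence $B_n(f; t) \to f$ can be observed numerically. The sketch below assumes the usual definition $B_n(f; t) = \sum_{k=0}^{n} f(k/n) \binom{n}{k} t^k (1-t)^{n-k}$ (the definition is not repeated in this section, so this is an assumption) and uses the test function $f(x) = |x - 1/2|$, which is our choice; the supremum is only estimated on a finite grid.

```python
from math import comb


def bernstein(f, n, t):
    """Value at t in [0, 1] of the n-th Bernstein polynomial of f
    (standard definition, assumed here)."""
    return sum(f(k / n) * comb(n, k) * t**k * (1 - t)**(n - k)
               for k in range(n + 1))


def sup_error(f, n, grid=200):
    """Grid estimate of sup |B_n(f; t) - f(t)| over [0, 1]."""
    return max(abs(bernstein(f, n, i / grid) - f(i / grid))
               for i in range(grid + 1))


def f(x):
    return abs(x - 0.5)   # continuous, not differentiable at 1/2


errors = [sup_error(f, n) for n in (4, 16, 64)]
print(errors)   # the estimated sup-errors shrink as n grows
```

For the three test functions $1$, $x$, $x^2$ the Bernstein polynomials are known in closed form, for instance $B_n(x^2; t) = t^2 + t(1-t)/n$, which is the kind of identity on which the Korovkin argument rests.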

In 1914, Herman Müntz established an unexpected connection between 
approximation theory and the divergence of series. Precisely, if $\lambda_0 = 0 < \lambda_1 < \lambda_2 < \cdots$ 
is an increasing sequence of real numbers, then the vector space generated by the 
monomials $x^{\lambda_k}$ is a dense subset of $C([0, 1], \mathbb{R})$ if and only if 

$$\sum_{k=1}^{\infty} \frac{1}{\lambda_k} = \infty.$$

See his paper Über den Approximationssatz von Weierstrass, published in H.A. 
Schwarz’s Festschrift, Berlin, 1914, pp. 303-312. 

An elementary proof of Voronovskaya’s Theorem is available in the paper of 
Coleman [2]. 

An important generalization of Theorem 8.8.1 is due to Marshall H. Stone. His 
result, known as the Stone—Weierstrass theorem, generalizes the Weierstrass Approx- 
imation Theorem in two directions: instead of the real interval [a,b], an arbitrary 
compact Hausdorff space X is considered, and instead of the algebra of polynomial 
functions, approximation with elements from more general subalgebras of C(X) is 
used. See the monograph of Hewitt and Stromberg [14], Theorem 7.30, p. 95. 

High-degree polynomial interpolation at equidistant points can be troublesome. 
This is known as the Runge phenomenon. Consider for example the function 

$$f(x) = \frac{1}{1 + 25x^2},$$

interpolated at $n + 1$ equidistant points $x_k = \frac{2k}{n} - 1$ $(k = 0, \dots, n)$ between $-1$ 
and $1$. As $n$ increases, the maximum over $[-1, 1]$ of the absolute value of the error 
between the function and the interpolating polynomial tends to $\infty$. The problem can 
be avoided by using spline curves, which are piecewise polynomial. 
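
The growth of the interpolation error is easy to observe numerically. The following is a minimal sketch using the plain Lagrange form of the interpolating polynomial; the grid on which the maximum error is estimated and the sample values of $n$ are our choices.

```python
def runge(x):
    return 1.0 / (1.0 + 25.0 * x * x)


def lagrange_interpolant(nodes, values, x):
    """Evaluate at x the polynomial through the points (nodes[k], values[k])."""
    total = 0.0
    for k, xk in enumerate(nodes):
        term = values[k]
        for j, xj in enumerate(nodes):
            if j != k:
                term *= (x - xj) / (xk - xj)
        total += term
    return total


def max_error(n, grid=1000):
    """Grid estimate of the largest error on [-1, 1] for n + 1 equidistant nodes."""
    nodes = [2.0 * k / n - 1.0 for k in range(n + 1)]
    values = [runge(x) for x in nodes]
    pts = [-1.0 + 2.0 * i / grid for i in range(grid + 1)]
    return max(abs(runge(x) - lagrange_interpolant(nodes, values, x))
               for x in pts)


errors = {n: max_error(n) for n in (4, 8, 12, 16, 20)}
for n, err in errors.items():
    print(n, err)   # the error grows with n; the largest deviations sit near the endpoints
```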

The subject of convex functions is intensively studied in books like that by 
Niculescu and Persson [15]. The chapter on generalized convexity is nicely comple- 
mented by the paper of Anderson et al. [16]. New extensions of Jensen’s Inequality 
can be found in the papers of Niculescu and his collaborators [17-19]. 

How discontinuous can a derivative be? By using the approximation formula 

$$\frac{f(x + 1/n) - f(x)}{1/n} \to f'(x)$$

as $n \to \infty$, we infer that a derivative is necessarily a Baire class one function. Taking 
into account Theorem 6.11.1, we conclude that the points of continuity of a derivative 
constitute a dense $G_\delta$ set. One can prove that a necessary and sufficient condition 
that a set $E \subset [a, b]$ be the set of discontinuities of a derivative is that $E$ be an $F_\sigma$ 
set of the first Baire category. See Bruckner and Leonard [20]. 

Banach’s argument on the existence of continuous nowhere differentiable 
functions can be used to answer many other problems. Consider the Banach space 
$D([a, b], \mathbb{R})$ of all functions $f : [a,b] \to \mathbb{R}$ with bounded derivative, endowed 
with the norm 

$$\|f\| = \sup_{x \in [a,b]} |f(x)| + \sup_{x \in [a,b]} |f'(x)|.$$

Denote by $C_0([a, b], \mathbb{R})$ and $D_0([a, b], \mathbb{R})$ the classes of all functions $f \in C([a, b], \mathbb{R})$ 
and $f \in D([a, b], \mathbb{R})$, respectively, that are monotone on no interval. Then $C_0([a, b], \mathbb{R})$ 
is a residual set in $C([a, b], \mathbb{R})$, while $D_0([a, b], \mathbb{R})$ is a nowhere dense set in 
$D([a, b], \mathbb{R})$. See Salat [21]. 

We next present an example of a homeomorphism 

$$F : [a,b] \to [u, v]$$

which is differentiable, strictly increasing, and has a bounded derivative which equals 
0 on a dense subset of the domain. This example is a variation of the one by Pompeiu 
[22]. 

Let $[u, v]$ be a fixed interval and let $(y_n)_{n \ge 1}$ be a dense subset of $[u, v]$, with distinct 
terms. Let $\sum_n c_n$ be a convergent series of positive terms. By the Comparison Test, 
the series 

$$\sum_n c_n (y - y_n)^{1/3}$$

is absolutely and uniformly convergent on $[u, v]$ and thus defines a continuous 
function on $[u, v]$ that will be called $\Phi(y)$. 
Since it is a sum of strictly increasing functions, $\Phi$ is strictly increasing too. Let 
$a = \Phi(u)$ and $b = \Phi(v)$. By Theorem 6.4.6, $\Phi$ is a homeomorphism of $[u, v]$ onto 
$[a, b]$. Let 

$$F = \Phi^{-1}.$$

We will show that $F$ has all the required properties. 

8.12.4 Lemma Consider the function $\Psi : [u,v] \to [0, \infty]$, defined by 

$$\Psi(y) = \frac{1}{3} \sum_n \frac{c_n}{(y - y_n)^{2/3}}.$$

(a) If $\Psi(y) < \infty$, then $\Phi$ is differentiable at $y$ and $\Phi'(y) = \Psi(y)$. 
(b) At every y where V(y) = 00 and at every yn, the function ® has infinite 
derivative. 


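Proof (a) Let $y \in [u,v]$ and $h \in [u - y, v - y]$ (so that $y + h \in [u, v]$). Since $\Phi$ is 
strictly increasing, it follows that $\frac{\Phi(y+h) - \Phi(y)}{h} > 0$ for all $h \neq 0$. Put 

$$p_n = \left( \frac{y + h - y_n}{y - y_n} \right)^{1/3} = \left( 1 + \frac{h}{y - y_n} \right)^{1/3}.$$

Then 

$$0 < \frac{\Phi(y+h) - \Phi(y)}{h} = \sum_n \frac{c_n}{(p_n^2 + p_n + 1)(y - y_n)^{2/3}} \le \frac{4}{3} \sum_n \frac{c_n}{(y - y_n)^{2/3}},$$

because $p^2 + p + 1 \ge 3/4$ for every real $p$. This implies that the series representing 
$\frac{\Phi(y+h) - \Phi(y)}{h}$ is uniformly convergent with respect to $h$ on $[u - y, v - y]$. 
Therefore 

$$\lim_{h \to 0} \frac{\Phi(y+h) - \Phi(y)}{h} = \lim_{h \to 0} \sum_n \frac{c_n}{(p_n^2 + p_n + 1)(y - y_n)^{2/3}} = \frac{1}{3} \sum_n \frac{c_n}{(y - y_n)^{2/3}} = \Psi(y).$$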

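(b) Let $\xi$ be a point where $\Psi(\xi) = \infty$. For each $m \in \mathbb{N}$, the function 

$$\omega_m(y) = \sum_{n=1}^{m} \frac{c_n}{(\xi + y - y_n)^{2/3} + (\xi + y - y_n)^{1/3} (\xi - y_n)^{1/3} + (\xi - y_n)^{2/3}}$$

is continuous at 0. 
Let $M > 0$ be arbitrarily fixed. Since 

$$\lim_{m \to \infty} \omega_m(0) = \lim_{m \to \infty} \sum_{n=1}^{m} \frac{c_n}{3 (\xi - y_n)^{2/3}} = \infty,$$

there is a natural number $N_1$ such that $\omega_m(0) > M$ for all $m \ge N_1$. The continuity 
of $\omega_m$ at 0 yields a number $\delta_m > 0$ such that 

$$\xi + y \in [u,v],\ |y| < \delta_m \quad \text{implies} \quad \omega_m(y) > M.$$

On the other hand, 

$$\lim_{m \to \infty} \omega_m(y) = \frac{\Phi(\xi + y) - \Phi(\xi)}{y},$$

so that there is a natural number $N_2$ such that 

$$\left| \frac{\Phi(\xi + y) - \Phi(\xi)}{y} - \omega_m(y) \right| < \varepsilon$$

for all $m \ge N_2$. Let $N = \max\{N_1, N_2\}$. Then, for $m \ge N$ and $|y| < \delta_m$, 

$$M - \varepsilon < \omega_m(y) - \varepsilon < \frac{\Phi(\xi + y) - \Phi(\xi)}{y},$$

and thus 

$$\lim_{y \to 0} \frac{\Phi(\xi + y) - \Phi(\xi)}{y} = \infty.$$

When $y = y_n$, we have 

$$\frac{\Phi(y + h) - \Phi(y)}{h} = \frac{c_n}{h^{2/3}} + \sum_{k \neq n} \frac{c_k}{(q_k^2 + q_k + 1)(y_n - y_k)^{2/3}},$$

where 

$$q_k = \left( 1 + \frac{h}{y_n - y_k} \right)^{1/3}.$$

Therefore, 

$$\frac{\Phi(y + h) - \Phi(y)}{h} \ge \frac{c_n}{h^{2/3}},$$

which implies that 

$$\lim_{h \to 0} \frac{\Phi(y + h) - \Phi(y)}{h} = \infty.$$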


By Theorem 8.1.4, the function $F$ is differentiable on $[a, b]$ and, if $x_n = \Phi(y_n)$, 
then 

$$F'(x_n) = \lim_{x \to x_n} \frac{F(x) - F(x_n)}{x - x_n} = \lim_{y \to y_n} \frac{y - y_n}{\Phi(y) - \Phi(y_n)} = 0.$$

In the same way, $F'$ is 0 at the points where $\Psi$ is $\infty$. Since $\Phi$ is a homeomorphism, 
the sequence $(x_n)$ is dense in $[a, b]$. Since $\Phi$ is strictly increasing, so is $F$. Therefore, 
$F' \ge 0$. 

To find an upper bound for the derivative of $F$, we only need to consider points 
$y$ where $\Psi(y) < \infty$. In this case, since $|y - y_n| \le v - u$, 

$$\Phi'(y) = \frac{1}{3} \sum_n \frac{c_n}{(y - y_n)^{2/3}} \ge \frac{1}{3 (v - u)^{2/3}} \sum_n c_n.$$

Thus, at $x = \Phi(y)$, 

$$F'(x) = \frac{1}{\Phi'(y)} \le \frac{3 (v - u)^{2/3}}{\sum_n c_n}.$$

At the rest of the points, $F' = 0$. This ends our presentation of Pompeiu’s example. 
An example of a function with a proper local maximum in each interval is available 
in the paper of Posey and Vaughan [23]. 
Other interesting pathological examples can be found in the books of Gelbaum 
and Olmsted [24, 25]. 


Chapter 9 
The Riemann Integral 


Since antiquity, people have been interested in computing the lengths of curves, 
the areas of surfaces, and the volumes of solids. 

Step by step, a general method of solving all these problems was developed. It 
was called Integral calculus and appeared in an intimate connection with Differential 
calculus. 

Having at hand the concept of length of an interval, we can immediately define 
the integral of a constant function and then the integral of a piecewise linear function 
over a compact interval in such a way that when dealing with positive functions, we 
get the area under the graph. See Fig. 9.1. 

The different extensions of the integral to larger classes of functions led to a large 
variety of sets that can be “measured”. 

In this chapter, we will present the Riemann integral. Although limited by some 
very severe restrictions (it applies only to bounded and “almost” continuous functions 
defined on compact intervals), the Riemann integral has the big advantage of being 
both simple and large enough to cover many interesting facts. 

In the next chapter, we will describe an important extension of the Riemann 
integral, the Lebesgue integral, whose main feature is the completeness of the space 
of integrable functions. 


9.1 The Integral of Regular Functions 


By a division of a (nondegenerate) compact interval $[a,b]$, we mean a finite set 
$\Delta = \{x_0, x_1, x_2, \dots, x_n\}$ such that $a = x_0 < x_1 < x_2 < \cdots < x_n = b$. The 
intervals $[x_k, x_{k+1}]$ are called the partial intervals associated to the division $\Delta$. Any 
two partial intervals are either disjoint or have in common an endpoint. If $\Delta_1$ and $\Delta_2$ 
are two divisions of $[a, b]$, we denote by $\Delta_1 \cup \Delta_2$ the division determined by all 
points in the two divisions. 

In what follows, a division of $[a, b]$ is seen either as a collection of nonoverlapping 
subintervals, or as a finite ordered set of division points. 

Fig. 9.1 Area under the graph 


Fig. 9.2 A step function 


9.1.1 Definition A function $f : [a,b] \to \mathbb{R}$ is called a step function if there is a 
division $\Delta$ of $[a, b]$ such that its restriction to the interior of every partial interval is 
constant. 


See Fig. 9.2. 
If $f$ is a step function and $f(x) = c_k$ when $x \in (x_k, x_{k+1})$, then we define the 
Riemann integral of $f$ by the formula 

$$I_\Delta(f) = \sum_{k=0}^{n-1} c_k (x_{k+1} - x_k).$$

The above definition is in fact independent of the particular division considered. 
If $f$ is a step function with respect to two divisions $\Delta_1$ and $\Delta_2$, then it is easy to see 
that $f$ is a step function with respect to $\Delta_1 \cup \Delta_2$. Moreover, $I_{\Delta_1} = I_{\Delta_2} = I_{\Delta_1 \cup \Delta_2}$. 
The best way to see this is by induction, adding to $\Delta_1$ one point at a time until we 
obtain $\Delta_1 \cup \Delta_2$. When we do this, at every step, some term $c_k (x_{k+1} - x_k)$ will be 
replaced by $c_k (x_{k+1} - x) + c_k (x - x_k)$ and the two quantities are equal. 
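
The independence of the division can be checked on a small example. The sketch below stores a step function as a list of division points together with the values taken on the open partial intervals; this representation and the helper names are ours. Exact rational arithmetic is used so that the two integrals can be compared with `==`.

```python
from fractions import Fraction


def step_integral(points, values):
    """I_Delta(f) = sum_k c_k (x_{k+1} - x_k), where the step function takes
    the value values[k] on the open interval (points[k], points[k+1])."""
    assert len(values) == len(points) - 1
    return sum(c * (b - a) for c, a, b in zip(values, points, points[1:]))


def refine(points, values, extra):
    """Describe the same step function on the finer division obtained by
    adding the points in `extra` (assumed to lie in [points[0], points[-1]])."""
    new_points = sorted(set(points) | set(extra))
    new_values = []
    for a, b in zip(new_points, new_points[1:]):
        # (a, b) lies inside exactly one old partial interval; find its index.
        k = max(i for i, p in enumerate(points[:-1]) if p <= a)
        new_values.append(values[k])
    return new_points, new_values


points, values = [0, 1, 3], [2, -1]        # f = 2 on (0, 1) and f = -1 on (1, 3)
finer_points, finer_values = refine(points, values, [Fraction(1, 2), 2])
print(step_integral(points, values), step_integral(finer_points, finer_values))
```

Each added point splits one term $c_k (x_{k+1} - x_k)$ into two terms with the same coefficient $c_k$, which is exactly the induction step described above.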

Since the value does not depend on the division, it is natural to denote the above 
integral as $I(f)$ instead of $I_\Delta(f)$. To highlight the interval of integration, we will use 
one of the symbols 

$$I_a^b(f) \quad \text{or} \quad \int_a^b f(x)\,dx \quad \text{or} \quad \int_a^b f\,dx$$

to denote the Riemann integral of $f$ over the interval $[a, b]$. 
By definition, the integral over an interval reduced to one point equals 0, that is, 

$$\int_a^a f\,dx = 0.$$

The set $\mathrm{St}([a, b], \mathbb{R})$ of all real-valued step functions defined on $[a, b]$ is a linear 
lattice of functions. 

The function $I_a^b : \mathrm{St}([a, b], \mathbb{R}) \to \mathbb{R}$, which associates to each step function the 
value of its Riemann integral $I_a^b(f)$, is called the integration functional. The main 
properties of the functional of integration are: 


(INT1) (Linearity) For all $\alpha, \beta \in \mathbb{R}$ and all $f, g \in \mathrm{St}([a,b], \mathbb{R})$, 

$$I_a^b(\alpha f + \beta g) = \alpha I_a^b(f) + \beta I_a^b(g).$$

(INT2) (Positivity) If $f \in \mathrm{St}([a,b], \mathbb{R})$ and $f \ge 0$, then $I_a^b(f) \ge 0$. 
(INT3) (Additivity) If $f \in \mathrm{St}([a,b], \mathbb{R})$ and $c \in [a, b]$, then 

$$I_a^b(f) = I_a^c(f) + I_c^b(f).$$

(INT4) (Calibration) $I_a^b(1) = b - a$. 

9.1.2 Remark From (INT1) and (INT4) we infer that 

$$I_a^b(C) = C (b - a)$$

for every constant $C$, while from (INT1) and (INT2), we infer the property of 
monotonicity of the functional of integration, 

$$f \le g \quad \text{implies} \quad I_a^b(f) \le I_a^b(g).$$

Therefore, if $f$ is bounded and $m \le f \le M$, then 

$$m (b - a) \le I_a^b(f) \le M (b - a).$$

A consequence of this double inequality is the property of boundedness of the 
functional of integration: 

$$\left| I_a^b(f) \right| \le (b - a) \cdot \sup\{ |f(x)| : x \in [a, b] \}.$$


9.1.3 Remark The additivity formula (INT3) can be extended by induction to any 
finite number of intermediate points. 

9.1.4 Remark The definition of the integral does not depend on the values the 
function takes at the endpoints of the partial intervals. Therefore, if $f, g \in \mathrm{St}([a, b], \mathbb{R})$ 
differ only at finitely many points, then 

$$I_a^b(f) = I_a^b(g).$$

To see this, let $a \le t_0 < t_1 < \cdots < t_n \le b$ be all the points where the functions are 
different. Then 

$$I_a^b(f) = I_a^{t_0}(f) + I_{t_0}^{t_1}(f) + \cdots + I_{t_n}^{b}(f) = I_a^{t_0}(g) + I_{t_0}^{t_1}(g) + \cdots + I_{t_n}^{b}(g) = I_a^b(g).$$

The boundedness of the functional of integration gives us the possibility of 
extending it to larger spaces which, as we will see, include $C([a, b], \mathbb{R})$. 


9.1.5 Definition A function $f : [a,b] \to \mathbb{R}$ is called regular if it is the uniform 
limit of a sequence of step functions. 


Every regular function is bounded. See Exercise 1. According to Theorem 6.8.3, 
a regular function has at most countably many points of discontinuity. 

The set $\mathcal{R}([a, b], \mathbb{R})$ of regular functions $f : [a,b] \to \mathbb{R}$ constitutes a linear 
lattice of functions with respect to the usual algebraic operations and the pointwise 
order. Clearly, 

$$\mathrm{St}([a, b], \mathbb{R}) \subset \mathcal{R}([a, b], \mathbb{R}).$$


9.1.6 Theorem (N. Bourbaki [1]) A function $f : [a,b] \to \mathbb{R}$ is regular if and only 
if it has only discontinuities of the first type. 


Therefore, $\mathcal{R}([a, b], \mathbb{R})$ contains all continuous functions and also all monotone 
functions on $[a, b]$. 


Proof Necessity. Suppose that $f : [a,b] \to \mathbb{R}$ is a regular function. We will show 
that $f$ has a finite right-hand limit at any $z \in [a, b)$. Indeed, due to the fact that $f$ is 
regular, there is a sequence $(f_n)_n$ of step functions, uniformly convergent to $f$. Let 
$\varepsilon > 0$. There is a natural number $N$ such that 

$$|f(x) - f_n(x)| < \frac{\varepsilon}{3}$$

for every $x \in [a,b]$ and $n \ge N$. Since $\lim_{x \to z+} f_N(x)$ exists, there is $z_\varepsilon \in (z, b]$ such 
that for every $x, y \in (z, z_\varepsilon]$, 

$$|f_N(x) - f_N(y)| < \frac{\varepsilon}{3}.$$

Therefore, if $x, y \in (z, z_\varepsilon]$, then 

$$|f(x) - f(y)| \le |f(x) - f_N(x)| + |f_N(x) - f_N(y)| + |f_N(y) - f(y)| < \varepsilon.$$

According to Cauchy’s Criterion for limits, $\lim_{x \to z+} f(x)$ exists and is finite. In a similar 
way, $\lim_{x \to z-} f(x)$ exists and is finite for every $z \in (a, b]$. Thus, $f$ has only 
discontinuities of the first type. 

Sufficiency. Suppose that $f$ is a function having finite one-sided limits at all points. 
We will prove that $f$ is the uniform limit of a sequence of step functions. 

Let $n$ be a positive integer and $x \in [a, b]$, arbitrarily fixed. Then, there is $r_x > 0$ 
such that the oscillation of $f$ on $(x - r_x, x) \cap [a, b]$ and on $(x, x + r_x) \cap [a, b]$ is less 
than $1/n$. Put $V_x = (x - r_x, x + r_x)$. Since $[a, b] \subset \bigcup_x V_x$, by the Borel-Lebesgue 
Lemma, there is a finite number of points $x_1, x_2, \dots, x_p$ such that $[a, b] \subset \bigcup_{k=1}^{p} V_{x_k}$. 
The points $a$, $b$, $x_k$, $\max\{x_k - r_{x_k}, a\}$, $\min\{x_k + r_{x_k}, b\}$, for $k = 1, 2, \dots, p$, can be put 
in increasing order to obtain a division of $[a, b]$ such that the oscillation of $f$ on the 
interior of each partial interval is less than $1/n$. Let $a = a_0 < a_1 < a_2 < \cdots < a_m = b$ 
be this division. For $k = 0, 1, \dots, m - 1$, choose arbitrarily $z_k \in (a_k, a_{k+1})$ 
and consider the step function 

$$f_n(x) = \begin{cases} f(x) & \text{if } x \in \{a_0, \dots, a_m\} \\ f(z_k) & \text{if } x \in (a_k, a_{k+1}),\ k \in \{0, \dots, m - 1\}. \end{cases}$$

Then 

$$|f(x) - f_n(x)| < \frac{1}{n} \quad \text{on } [a, b],$$

which implies that $f_n \to f$ uniformly. 


The extension of the functional of integration to $\mathcal{R}([a, b], \mathbb{R})$ is done by a 
density argument. Let $f \in \mathcal{R}([a, b], \mathbb{R})$ and let $(f_n)_n$ be a sequence of step functions 
uniformly convergent to $f$. Given $\varepsilon > 0$, there is a natural number $N$ such that 

$$|f_n(x) - f_m(x)| < \frac{\varepsilon}{b - a}$$

for every $x \in [a,b]$ and every $n, m \ge N$. According to the boundedness of the 
functional of integration, $|I_a^b(f_n) - I_a^b(f_m)| < \varepsilon$ for every $n, m \ge N$. Therefore 
$(I_a^b(f_n))_n$ is a Cauchy sequence and we define 

$$I_a^b(f) = \lim_{n \to \infty} I_a^b(f_n).$$


The above limit does not depend on the particular sequence of step functions 
approximating $f$. Indeed, let $(g_n)_n$ be another sequence of step functions uniformly 
convergent to $f$. Then, the sequence $f_1, g_1, f_2, g_2, \dots, f_n, g_n, \dots$ also converges 
uniformly to $f$ and thus, as above, we infer that $I_a^b(f_1), I_a^b(g_1), I_a^b(f_2), I_a^b(g_2), \dots, 
I_a^b(f_n), I_a^b(g_n), \dots$ is a convergent sequence. Therefore 

$$\lim_{n \to \infty} I_a^b(f_n) = \lim_{n \to \infty} I_a^b(g_n).$$

$I_a^b(f)$ is called the Riemann integral of $f$ and, as in the case of step functions, it 
is also denoted by 

$$\int_a^b f(x)\,dx \quad \text{or} \quad \int_a^b f\,dx.$$

In the case of positive regular functions $f : [a,b] \to \mathbb{R}$, the geometric meaning 
of the integral defined above is the area under the graph, 

$$\operatorname{Area}\left( \{ (x, y) : x \in [a,b] \text{ and } 0 \le y \le f(x) \} \right) = \int_a^b f(x)\,dx.$$
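
The definition of the integral of a regular function as a limit of integrals of step functions can be illustrated numerically. In the sketch below, the regular function (a parabola with one jump), the equal partial intervals, and the choice of midpoints as the sample points are all our assumptions; the jump is placed at $1/2$ so that it is a division point whenever the number of partial intervals is even.

```python
def f(x):
    # A regular function on [0, 1] with one discontinuity of the first type at 1/2.
    return x * x + (1.0 if x >= 0.5 else 0.0)


def step_approximation_integral(f, a, b, n):
    """Integral of the step function that equals f(z_k) on the k-th of n equal
    partial intervals of [a, b], where z_k is the midpoint of that interval."""
    h = (b - a) / n
    return sum(f(a + (k + 0.5) * h) * h for k in range(n))


exact = 1.0 / 3.0 + 0.5    # integral of x^2 over [0, 1] plus the area of the jump part
approximations = [step_approximation_integral(f, 0.0, 1.0, n)
                  for n in (2, 8, 32, 128)]
for value in approximations:
    print(value, abs(value - exact))   # the integrals of the step functions approach 5/6
```

On each open partial interval the step function differs from $f$ by at most $1/n$, so these step functions converge uniformly to $f$ and their integrals form the Cauchy sequence used in the definition.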


The next result shows that the Riemann integral of regular functions has the 
same basic properties as in the case of step functions. 


9.1.7 Proposition The Riemann integral of regular functions is a procedure of 
associating to any compact interval $[a, b]$ a functional 

$$I_a^b : \mathcal{R}([a,b], \mathbb{R}) \to \mathbb{R}$$

with the following four properties: 
(INT1) (Linearity) For all $\alpha, \beta \in \mathbb{R}$ and all $f, g \in \mathcal{R}([a, b], \mathbb{R})$, 

$$I_a^b(\alpha f + \beta g) = \alpha I_a^b(f) + \beta I_a^b(g).$$

(INT2) (Positivity) If $f \ge 0$, then $I_a^b(f) \ge 0$. 
(INT3) (Additivity) If $f \in \mathcal{R}([a, b], \mathbb{R})$ and $c \in [a,b]$, then 

$$I_a^b(f) = I_a^c(f) + I_c^b(f).$$

(INT4) (Calibration) $I_a^b(1) = b - a$. 


Notice that both Remarks 9.1.2 and 9.1.3 extend verbatim to the framework of 
regular functions. 
In order to offer more flexibility to the additivity formula, we define 

$$\int_b^a f\,dx = -\int_a^b f\,dx \qquad (9.1)$$

for every $f \in \mathcal{R}([a, b], \mathbb{R})$; this allows us to state the additivity formula (INT3) in 
the following more general form: 

$$\int_u^v f\,dx = \int_u^w f\,dx + \int_w^v f\,dx \quad \text{for all } u, v, w \in [a, b].$$

The formula (9.1) shows that the process of integration acts over oriented compact 
intervals; when the orientation changes, the sign of the integral changes. Once again, 

$$\int_a^a f\,dx = \int_a^b f\,dx + \int_b^a f\,dx = 0.$$


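9.1.8 Proposition (The stability of the integral under perturbations at finitely many 
points) If $f \in \mathcal{R}([a, b], \mathbb{R})$ and $g : [a, b] \to \mathbb{R}$ is a function such that $g(x) = f(x)$ 
except for a finite number of points, then $g \in \mathcal{R}([a, b], \mathbb{R})$ and 

$$\int_a^b g\,dx = \int_a^b f\,dx.$$


Proof By Bourbaki’s theorem, we infer that $g \in \mathcal{R}([a, b], \mathbb{R})$. If $f$ and $g$ differ 
only at $a$, then for every $\varepsilon \in (0, b - a)$ we have 

$$|I_a^b(f) - I_a^b(g)| = |I_a^{a+\varepsilon}(f) - I_a^{a+\varepsilon}(g) + I_{a+\varepsilon}^{b}(f) - I_{a+\varepsilon}^{b}(g)| = |I_a^{a+\varepsilon}(f) - I_a^{a+\varepsilon}(g)| = |I_a^{a+\varepsilon}(f - g)|$$
$$\le \varepsilon \cdot \sup\{ |f(x) - g(x)| : x \in [a, b] \},$$

and taking the infimum over $\varepsilon$ we conclude that $I_a^b(f) = I_a^b(g)$. In the general case, 
we break $[a, b]$ into appropriate subintervals. 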


9.1.9 Theorem (Commutation of limit and integral) If $(f_n)_n$ is a sequence of 
functions in $\mathcal{R}([a,b], \mathbb{R})$ and $f_n \to f$ uniformly, then $f \in \mathcal{R}([a, b], \mathbb{R})$ and 

$$I_a^b(f) = \lim_{n \to \infty} I_a^b(f_n).$$

In other words, 

$$\int_a^b \left( \lim_{n \to \infty} f_n \right) dx = \lim_{n \to \infty} \int_a^b f_n\,dx.$$

Proof Since $f_n \in \mathcal{R}([a, b], \mathbb{R})$, there is a step function $g_n$ such that 

$$|f_n(x) - g_n(x)| < \frac{1}{n}$$

for all $x \in [a, b]$. Thus 

$$|f(x) - g_n(x)| \le |f(x) - f_n(x)| + |f_n(x) - g_n(x)| \le \sup_{x \in [a,b]} |f(x) - f_n(x)| + \frac{1}{n}$$

for all $x \in [a, b]$. Therefore 

$$\sup_{x \in [a,b]} |f(x) - g_n(x)| \le \sup_{x \in [a,b]} |f(x) - f_n(x)| + \frac{1}{n} \to 0$$

as $n \to \infty$. This shows that $f$ is a uniform limit of step functions (hence 
$f$ belongs to $\mathcal{R}([a, b], \mathbb{R})$) and $I_a^b(f) = \lim_{n \to \infty} I_a^b(g_n)$. Consequently 

$$|I_a^b(f) - I_a^b(f_n)| \le |I_a^b(f) - I_a^b(g_n)| + |I_a^b(g_n) - I_a^b(f_n)| \le |I_a^b(f) - I_a^b(g_n)| + \frac{b - a}{n} \to 0$$

as $n \to \infty$. 


The Riemann integral of regular functions is unique. 


9.1.10 Theorem The Riemann integral of regular functions is the only procedure of 
associating to any compact interval $[a, b]$ a functional 

$$F_a^b : \mathcal{R}([a, b], \mathbb{R}) \to \mathbb{R}$$

that satisfies the properties of linearity, positivity, additivity, and calibration as stated 
in Proposition 9.1.7. 


Proof Indeed, this is clear at the level of step functions. The four properties listed in 
the hypotheses yield the property of monotonicity, that is, 

$$f \le g \quad \text{implies} \quad F_a^b(f) \le F_a^b(g).$$

Then 

$$|F_a^b(f)| \le F_a^b(|f|) \le (b - a) \sup_{x \in [a,b]} |f(x)|$$

for every $f \in \mathcal{R}([a, b], \mathbb{R})$, and this fact assures the commutation of limits with 
the functionals $F_a^b$. Therefore, if $f \in \mathcal{R}([a, b], \mathbb{R})$ and $(f_n)_n$ is a sequence of step 
functions converging uniformly to $f$, we have 

$$F_a^b(f) = \lim_{n \to \infty} F_a^b(f_n) = \lim_{n \to \infty} \int_a^b f_n\,dx = \int_a^b f\,dx,$$

by Theorem 9.1.9. 


9.1.11 Remark (Integration of complex-valued functions) The above theory can be 
extended verbatim to the case of complex-valued functions $f : [a, b] \to \mathbb{C}$. We say 
that $f$ is regular if both $\operatorname{Re} f$ and $\operatorname{Im} f$ are regular functions in the sense of Definition 
9.1.5. Since the convergence in $\mathbb{C}$ is equivalent to the coordinatewise convergence, 
$f$ is regular if and only if it admits one-sided limits at each point. Moreover, 

$$\int_a^b f(x)\,dx = \int_a^b \operatorname{Re} f(x)\,dx + i \int_a^b \operatorname{Im} f(x)\,dx.$$

An application of the theory of integration of complex-valued functions is given in 
Exercise 5. 


Exercises 


1. Infer from Definition 9.1.5 that every regular function is bounded. 

2. (Numerical approximation of integrals). Suppose that $f : [a,b] \to \mathbb{R}$ is a 
continuous function. 
(a) Prove that 

$$\int_a^b f(x)\,dx = \lim_{n \to \infty} \frac{b - a}{n} \sum_{k=0}^{n-1} f\left( a + k\,\frac{b - a}{n} \right) = \lim_{n \to \infty} \frac{b - a}{n} \sum_{k=1}^{n} f\left( a + k\,\frac{b - a}{n} \right).$$

(b) What is the geometric interpretation of these limits? 
Note In practice, the arithmetic mean of the two formulas at point (a) gives better 
results. This is known as the Trapezoidal Rule. 

3. Let $f : [a, b] \to \mathbb{R}$ be a Lipschitz function and $n$ a positive integer. Prove that 

$$\left| \int_a^b f(x)\,dx - \frac{b - a}{n} \sum_{k=0}^{n-1} f\left( a + k\,\frac{b - a}{n} \right) \right| \le \frac{(b - a)^2}{2n}\, \|f\|_{\mathrm{Lip}},$$

and notice that a similar estimate works for the error in the Trapezoidal Rule. 

4. We know that 

$$\lim_{n \to \infty} \left( \frac{1}{n+1} + \frac{1}{n+2} + \cdots + \frac{1}{2n} \right) = \ln 2;$$

see Exercise 2 at the end of Sect. 2.3. 
(a) Prove that this formula implies the equality 

$$\int_0^1 \frac{dx}{1 + x} = \ln 2.$$

(b) Prove that 

$$\lim_{n \to \infty} \left( \sin\frac{\pi}{n+1} + \sin\frac{\pi}{n+2} + \cdots + \sin\frac{\pi}{2n} \right) = \pi \ln 2$$

and find an integral that leads to this conclusion. 

5. Let $f : [a, b] \to \mathbb{R}$ be a regular function. Show that 

$$\left( \int_a^b f(x) \cos x\,dx \right)^2 + \left( \int_a^b f(x) \sin x\,dx \right)^2 \le \left( \int_a^b |f(x)|\,dx \right)^2.$$


9.2 The Fundamental Theorem of Calculus 


The integral calculus of regular functions is deeply related to differential calculus. 
In what follows $I$ denotes a nondegenerate interval. 


9.2.1 Lemma Let $f : I \to \mathbb{R}$ be a function whose restriction to any compact 
subinterval of $I$ is regular and $c \in I$. Then the formula 

$$F(x) = \int_c^x f(t)\,dt, \quad x \in I,$$

defines a continuous function having finite one-sided derivatives at all points and 
these derivatives equal the corresponding one-sided limits of $f$. 


Consequently, $F$ is differentiable at each continuity point $a$ of $f$ and 

$$F'(a) = f(a).$$

Proof The continuity of $F$ follows from the fact that $F$ is Lipschitz on each compact 
subinterval $A \subset I$. In fact, if $x', x'' \in A$, then 

$$|F(x') - F(x'')| = \left| \int_{x'}^{x''} f(t)\,dt \right| \le |x'' - x'| \cdot \sup_{x \in A} |f(x)|.$$

Concerning the existence of one-sided derivatives, if $a$ is any point in $I$ except 
for the right endpoint, then for $x > a$, we have 

$$\frac{F(x) - F(a)}{x - a} - f(a+) = \frac{1}{x - a} \int_a^x f(t)\,dt - f(a+) = \frac{1}{x - a} \int_a^x \left( f(t) - f(a+) \right) dt.$$

Let $\varepsilon > 0$. The existence of $f(a+)$ yields a number $\delta > 0$ such that 

$$0 < x - a < \delta \quad \text{implies} \quad |f(x) - f(a+)| < \varepsilon,$$

so for every such $x$, 

$$\left| \frac{F(x) - F(a)}{x - a} - f(a+) \right| = \frac{1}{|x - a|} \left| \int_a^x \left( f(t) - f(a+) \right) dt \right| \le \frac{1}{|x - a|} \cdot |x - a| \sup_{t \in (a,x]} |f(t) - f(a+)| \le \varepsilon.$$

(The value of the integrand at the single point $t = a$ does not affect the integral; 
see Proposition 9.1.8.) Thus 

$$\lim_{x \to a+} \frac{F(x) - F(a)}{x - a} = f(a+),$$

and the computation of the left derivative follows in a similar way. The statement 
about the differentiability of $F$ at the points where $f$ is continuous is obvious 
now. 


The previous lemma leads us to the concept of antiderivative. We will define it in 
a generalized form, motivated by the additivity property of the integral. 


9.2.2 Definition A function $F : I \to \mathbb{R}$ is called an antiderivative of $f : I \to \mathbb{R}$ if 
$F$ is continuous on $I$, differentiable except for a finite subset $A$ of $I$ and 

$$F'(x) = f(x)$$

at all $x \in I \setminus A$. 
It is important to notice that the above definition is different from the one usually 
found in calculus textbooks, where the requirement is that $F'(x) = f(x)$ at all $x \in I$. 

In applications, the main class of functions having antiderivatives is that of 
piecewise continuous functions. These are the functions $f : [a,b] \to \mathbb{R}$ which are 
continuous on $[a, b]$ except for a finite set $\{x_1, x_2, \dots, x_n\}$, where there are finite 
one-sided limits. Notice that the restriction of $f$ to any of the subintervals 

$$[a, x_1],\ [x_1, x_2],\ \dots,\ [x_n, b]$$

is continuous except possibly at the endpoints. 

The antiderivatives of a piecewise continuous function $f : [a, b] \to \mathbb{R}$ are 
piecewise $C^1$ functions, that is, they are continuous on $[a, b]$ and differentiable on $[a, b]$ 
except for a finite number of points where they admit finite one-sided derivatives. 
See Exercise 5. 

By the Mean Value Theorem, any two antiderivatives of a function differ by a 
constant. The family of all antiderivatives of a function $f$ is denoted by $\int f\,dx$ or 
$\int f(x)\,dx$. The following formulas hold: 

$$\int x^{\alpha}\,dx = \frac{x^{\alpha+1}}{\alpha + 1} + \mathcal{C} \quad \text{if } \alpha \neq -1$$
$$\int \frac{1}{x}\,dx = \ln x + \mathcal{C} \quad \text{for } x > 0$$
$$\int \frac{dx}{1 + x^2} = \arctan x + \mathcal{C}$$
$$\int e^{x}\,dx = e^{x} + \mathcal{C}$$
$$\int \sin x\,dx = -\cos x + \mathcal{C}$$
$$\int \cos x\,dx = \sin x + \mathcal{C}$$

where $\mathcal{C}$ is the family of all constant functions. Our approach also includes formulas 
such as 

$$\int \operatorname{sgn} x\,dx = |x| + \mathcal{C}.$$


A word of caution is necessary. Only in a few cases are the antiderivatives of 
elementary functions again elementary functions. For functions like $e^{-x^2}$, $\frac{\sin x}{x}$, and $x^x$, 
this does not work (due to a theorem proved by J. Liouville in 1840). See the book of 
Ritt [2] for details. Since such antiderivatives occur quite frequently, they got the 
status of special functions. For example, 

$$\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2}\,dt \quad \text{(the error function)}$$

and 

$$\operatorname{Si}(x) = \int_0^x \frac{\sin t}{t}\,dt \quad \text{(the sine integral)}.$$

Exercises 13 and 15 illustrate two numerical methods for evaluating integrals. 
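
Since these special functions are defined by integrals, their values can be approximated by any quadrature rule. The sketch below uses a plain midpoint rule, which is our choice and not necessarily one of the methods of Exercises 13 and 15; the result for erf is compared with `math.erf` from the Python standard library.

```python
import math


def midpoint_rule(f, a, b, n=2000):
    """Midpoint approximation of the integral of f over [a, b] with n subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))


def erf_numeric(x):
    return 2.0 / math.sqrt(math.pi) * midpoint_rule(lambda t: math.exp(-t * t), 0.0, x)


def si_numeric(x):
    # sin(t)/t extends continuously to t = 0 with the value 1; the midpoints
    # used by the rule are never 0 when x > 0, so no special case is needed.
    return midpoint_rule(lambda t: math.sin(t) / t, 0.0, x)


print(erf_numeric(1.0), math.erf(1.0))
print(si_numeric(math.pi))
```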


9.2.3 Fundamental Theorem of Calculus Let $f : [a,b] \to \mathbb{R}$ be a piecewise 
continuous function and let $F$ be an antiderivative of it. Then 

$$\int_a^b f(x)\,dx = F(b) - F(a) = F(x)\Big|_a^b.$$

The last equality represents a notation. 


Proof By Lemma 9.2.1, the function $x \mapsto \int_a^x f(t)\,dt$ is an antiderivative of $f$. Since 
any two antiderivatives of a function differ by a constant, the function 

$$x \mapsto \int_a^x f(t)\,dt - F(x)$$

is constant and so equals its value at $a$, which is $-F(a)$. Thus 

$$\int_a^x f(t)\,dt = F(x) - F(a)$$

for all $x \in [a, b]$ and in particular for $x = b$. 


The previous theorem is also known as the Leibniz-Newton Formula. 
The Fundamental Theorem of Calculus has many applications. Some of them will 
be shown in the rest of the section. 
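
As a small illustration of the theorem in the generalized setting, take the piecewise continuous function $\operatorname{sgn}$ on $[-1, 2]$ with the antiderivative $F(x) = |x|$, which is not differentiable at 0. The sketch below compares a midpoint sum with $F(2) - F(-1)$; the interval and the number of subintervals are our choices, made so that 0 is a division point.

```python
def sgn(x):
    return (x > 0) - (x < 0)


def midpoint_sum(f, a, b, n):
    """Midpoint approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (k + 0.5) * h) for k in range(n)) * h


a, b = -1.0, 2.0
numeric = midpoint_sum(sgn, a, b, 3000)
exact = abs(b) - abs(a)        # F(b) - F(a) with F(x) = |x|
print(numeric, exact)
```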


9.2.4 The Integration by Parts Formula If $f, g : [a,b] \to \mathbb{R}$ are two piecewise 
$C^1$ functions, then 

$$\int_a^b f'(x) g(x)\,dx = f(x) g(x)\Big|_a^b - \int_a^b f(x) g'(x)\,dx.$$

Proof If $f$ and $g$ are $C^1$ functions, then we integrate the equation 

$$f'g + fg' = (fg)'$$

side by side and apply the Fundamental Theorem of Calculus to the integral of the 
right-hand side. In the general case, let $a = x_0 < x_1 < \cdots < x_n = b$ be a division 
of $[a, b]$ such that on each partial interval, both $f$ and $g$ are $C^1$. Then 

$$\int_a^b f'g\,dx = \sum_{k=0}^{n-1} \int_{x_k}^{x_{k+1}} f'g\,dx = \sum_{k=0}^{n-1} \left( f(x) g(x)\Big|_{x_k}^{x_{k+1}} - \int_{x_k}^{x_{k+1}} fg'\,dx \right)$$
$$= \sum_{k=0}^{n-1} f(x) g(x)\Big|_{x_k}^{x_{k+1}} - \sum_{k=0}^{n-1} \int_{x_k}^{x_{k+1}} fg'\,dx = f(x) g(x)\Big|_a^b - \int_a^b fg'\,dx.$$


The Integration by Parts Formula can be extended by mathematical induction as 
follows: 


9.2.5 Corollary (Repeated integration by parts) For $f, g \in C^n([a, b], \mathbb{R})$, 

$$\int_a^b f(x) g^{(n)}(x)\,dx = \sum_{k=0}^{n-1} (-1)^k f^{(k)}(x)\, g^{(n-1-k)}(x)\Big|_a^b + (-1)^n \int_a^b f^{(n)}(x) g(x)\,dx.$$

A nice application of Corollary 9.2.5 (for $n = p + 1$ and $g(x) = (b - x)^p / p!$) 
is Taylor’s formula of order $p$ with integral remainder: if $f$ is a function with a 
continuous derivative of order $p + 1$ on an interval $I$ and $a$ and $x$ are points in $I$, then 

$$f(x) = \sum_{k=0}^{p} \frac{f^{(k)}(a)}{k!} (x - a)^k + \frac{1}{p!} \int_a^x (x - t)^p f^{(p+1)}(t)\,dt. \qquad (9.2)$$
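
Taylor’s formula with integral remainder can be checked numerically on a function whose derivatives are all known. The sketch below takes $f = \exp$ (our choice, so that $f^{(k)} = \exp$ for every $k$) and approximates the remainder integral by a midpoint sum; the function name and the number of subintervals are illustrative.

```python
import math


def taylor_with_integral_remainder(x, a, p, n=4000):
    """For f = exp, return the Taylor polynomial of order p at a evaluated at x,
    plus (1/p!) * integral_a^x (x - t)^p f^(p+1)(t) dt, the integral being
    approximated by a midpoint sum with n subintervals."""
    poly = sum(math.exp(a) * (x - a) ** k / math.factorial(k)
               for k in range(p + 1))
    h = (x - a) / n
    integral = h * sum((x - t) ** p * math.exp(t)
                       for t in (a + (k + 0.5) * h for k in range(n)))
    return poly + integral / math.factorial(p)


print(taylor_with_integral_remainder(1.0, 0.0, 3), math.e)
```

With $a = 0$, $x = 1$ and $p = 3$ the polynomial part is $1 + 1 + 1/2 + 1/6$, and the integral term supplies the missing $e - 8/3$.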


9.2.6 The Change of Variable in the Integral Let $f : [a,b] \to \mathbb{R}$ be a piecewise 
continuous function and let $\varphi : [\alpha, \beta] \to \mathbb{R}$ be a piecewise $C^1$ function such that 
$\varphi([\alpha, \beta]) \subset [a, b]$. Then 

$$\int_{\varphi(\alpha)}^{\varphi(\beta)} f(x)\,dx = \int_{\alpha}^{\beta} f(\varphi(t)) \cdot \varphi'(t)\,dt.$$

Proof We will consider only the case when $f$ is continuous and $\varphi$ is $C^1$. The general 
case will follow like in the previous theorem. Let $F$ be an antiderivative of $f$. Then 

$$(F \circ \varphi)' = (F' \circ \varphi) \cdot \varphi' = (f \circ \varphi) \cdot \varphi'.$$

By the Fundamental Theorem of Calculus, we have 

$$\int_{\alpha}^{\beta} f(\varphi(t)) \cdot \varphi'(t)\,dt = F(\varphi(\beta)) - F(\varphi(\alpha)) = \int_{\varphi(\alpha)}^{\varphi(\beta)} f(x)\,dx.$$

In some applications, it is better to read the change of variable formula backward. 


9.2.7 Corollary Let $\varphi : [\alpha, \beta] \to [a,b]$ be a $C^1$ diffeomorphism. Then, for every 
continuous function $f : [a,b] \to \mathbb{R}$, 

$$\int_{\alpha}^{\beta} f(\varphi(t))\,dt = \int_{\varphi(\alpha)}^{\varphi(\beta)} f(x) \cdot (\varphi^{-1})'(x)\,dx.$$


Next, we will apply the Fundamental Theorem of Calculus to show that limits and 
derivatives commute. This is an example of the great simplifications that integral 
calculus brings to differential calculus. 


9.2.8 Theorem Let $(f_n)_n$ be a sequence of functions in $C^1([a, b], \mathbb{R})$ such that for 
some point $c \in [a, b]$ the limit $\lim_{n \to \infty} f_n(c) = A$ exists in $\mathbb{R}$. If the sequence of their 
derivatives converges uniformly to a function $g : [a,b] \to \mathbb{R}$, then the sequence 
$(f_n)_n$ itself is uniformly convergent to some function $f$. Moreover, $f \in C^1([a, b], \mathbb{R})$ 
and 

$$f' = g.$$

Proof According to the Fundamental Theorem of Calculus, 

$$f_n(x) = f_n(c) + \int_c^x f_n'(t)\,dt \qquad (9.3)$$

for all $x \in [a,b]$ and $n \in \mathbb{N}$. For $x$ arbitrarily fixed, the right-hand side of the above 
formula converges to $f(x) = A + \int_c^x g(t)\,dt$. See Theorem 9.1.9. Since $f(c) = A$, 
it follows that $(f_n)_n$ converges pointwise to the function $f$ defined by the formula 

$$f(x) = f(c) + \int_c^x g(t)\,dt. \qquad (9.4)$$

By Theorem 6.8.3, the function $g$ is continuous, so from Lemma 9.2.1 and formula 
(9.4) we infer that $f \in C^1([a, b], \mathbb{R})$ and $f' = g$. The uniform convergence of the 
sequence $(f_n)_n$ to the function $f$ follows from the inequality 

$$|f_n(x) - f(x)| \le |f_n(c) - f(c)| + (b - a) \sup_{z \in [a,b]} |f_n'(z) - g(z)|,$$

which results by combining the formulas (9.3) and (9.4). 


The Riemann integral has many spectacular applications. An example is the 
following formula for $\pi$, discovered in 1995 by David Bailey, Peter Borwein, and Simon 
Plouffe: 

$$\pi = \sum_{k=0}^{\infty} \frac{1}{16^k} \left( \frac{4}{8k+1} - \frac{2}{8k+4} - \frac{1}{8k+5} - \frac{1}{8k+6} \right). \qquad (9.5)$$

This formula can be deduced from the fact that 

$$\int_0^{1/\sqrt{2}} \frac{x^{r-1}}{1 - x^8}\,dx = \int_0^{1/\sqrt{2}} \sum_{k=0}^{\infty} x^{r-1+8k}\,dx = \sum_{k=0}^{\infty} \int_0^{1/\sqrt{2}} x^{r-1+8k}\,dx$$
$$= \sum_{k=0}^{\infty} \frac{1}{r + 8k}\, x^{r+8k}\Big|_0^{1/\sqrt{2}} = 2^{-r/2} \sum_{k=0}^{\infty} \frac{1}{16^k} \left( \frac{1}{r + 8k} \right)$$

for $r = 1, 2, \dots, 7$. The commutation of the integral and the series follows from 
Theorem 9.1.9. Thus, the sum of the series (9.5) is 

$$\int_0^{1/\sqrt{2}} \frac{4\sqrt{2} - 8x^3 - 4\sqrt{2}\,x^4 - 8x^5}{1 - x^8}\,dx.$$

Making the substitution $y = \sqrt{2}\,x$, we get 

$$\int_0^1 \frac{16y - 16}{y^4 - 2y^3 + 4y - 4}\,dy = \int_0^1 \frac{4y}{y^2 - 2}\,dy - \int_0^1 \frac{4y - 8}{y^2 - 2y + 2}\,dy = \pi.$$

Very interesting comments about the origin and the use of the formula (9.5) can 
be found in [3]. Notice that if, instead of the usual decimal numeration system, we 
consider the hexadecimal one (that is, the one in base 16), this formula gives us the 
hexadecimal digits of $\pi$ without it being necessary to find the previous ones. 
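
Both uses of the formula (9.5) can be sketched in a few lines. The first function sums the series directly. The second one extracts a single hexadecimal digit by multiplying the series by $16^d$ and keeping only fractional parts, with modular exponentiation for the terms $k \le d$; this is the standard digit-extraction scheme associated with the formula, written here in floating point, so it should only be trusted for small positions $d$. The function names are ours.

```python
import math


def bbp_partial_sum(terms):
    """Partial sum of the series (9.5) with the given number of terms."""
    return sum((4 / (8 * k + 1) - 2 / (8 * k + 4)
                - 1 / (8 * k + 5) - 1 / (8 * k + 6)) / 16 ** k
               for k in range(terms))


def _fractional_series(j, d):
    """Fractional part of sum_{k >= 0} 16^(d - k) / (8k + j)."""
    s = 0.0
    for k in range(d + 1):
        # Only 16^(d-k) mod (8k + j) matters for the fractional part.
        s = (s + pow(16, d - k, 8 * k + j) / (8 * k + j)) % 1.0
    k = d + 1
    while True:
        term = 16.0 ** (d - k) / (8 * k + j)
        if term < 1e-17:
            break
        s += term
        k += 1
    return s % 1.0


def pi_hex_digit(d):
    """Hexadecimal digit of pi at position d + 1 after the point (d >= 0)."""
    x = (4 * _fractional_series(1, d) - 2 * _fractional_series(4, d)
         - _fractional_series(5, d) - _fractional_series(6, d)) % 1.0
    return "0123456789ABCDEF"[int(16 * x)]


print(bbp_partial_sum(12), math.pi)
print("".join(pi_hex_digit(d) for d in range(8)))   # leading hex digits after "3."
```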


9.2.9 Remark The concept of antiderivative can be easily extended to the case of 
complex-valued functions. Moreover, all the formulas in this section work in this 
more general setting. It is useful to remark that for $a \in \mathbb{C} \setminus \{0\}$, an antiderivative of
$e^{ax}$ is $e^{ax}/a$.


Euler’s formulas, connecting the exponential and the trigonometric functions, can
be used to simplify the computation of some real integrals by associating them to
complex functions suitably chosen. For example, to compute

$$I = \int_0^{2\pi} e^{ax} \cos(nx)\,dx$$

with $a \in \mathbb{R}$ and $n \in \mathbb{N}$, we put

$$J = \int_0^{2\pi} e^{ax} \sin(nx)\,dx$$

and notice that

$$I + iJ = \int_0^{2\pi} e^{(a+in)x}\,dx = \left(e^{2\pi(a+in)} - 1\right)/(a + in) = \frac{e^{2\pi a} - 1}{a^2 + n^2}\cdot (a - in).$$

Therefore, $I = a\,(e^{2\pi a} - 1)/(a^2 + n^2)$ and $J = -n\,(e^{2\pi a} - 1)/(a^2 + n^2)$.
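As a sanity check, the closed forms for the two integrals can be compared with a crude numerical quadrature. The sketch below is only illustrative: the sample values `a = 0.3`, `n = 2` and the helper `midpoint_rule` are assumptions made here, not part of the text.

```python
import math

def closed_form(a, n):
    """Values of I and J given by the formulas above (a, n not both zero)."""
    factor = (math.exp(2 * math.pi * a) - 1) / (a * a + n * n)
    return a * factor, -n * factor

def midpoint_rule(func, lo, hi, steps=20000):
    """Composite midpoint rule, accurate enough for a smooth integrand."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (i + 0.5) * h) for i in range(steps))

a, n = 0.3, 2  # sample parameters chosen for illustration
I_exact, J_exact = closed_form(a, n)
I_num = midpoint_rule(lambda x: math.exp(a * x) * math.cos(n * x), 0.0, 2 * math.pi)
J_num = midpoint_rule(lambda x: math.exp(a * x) * math.sin(n * x), 0.0, 2 * math.pi)
print(I_exact, I_num)
print(J_exact, J_num)
```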


Exercises 


1. Compute the integrals: 


m/4 


3 
(a) f sen ax; (b) [inc tan x) dx: 


0 0 
1 1 
Ind+x)) dx 
© | eae © | eer 


2. Let $f : [-a, a] \to \mathbb{R}$ be a regular function. Prove that:

(a) if the function $f$ is odd, then $\int_{-a}^{a} f(x)\,dx = 0$;

(b) if the function $f$ is even, then $\int_{-a}^{a} f(x)\,dx = 2\int_0^a f(x)\,dx$.
3. Let $f : [-1, 1] \to \mathbb{R}$ be a continuous function. Prove that

$$\int_0^{2\pi} x f(\cos x)\,dx = \pi \int_0^{2\pi} f(\cos x)\,dx$$

and compute the integral


20 
X COS X 
i ON as 
5 +2cos2 x 
0 
4. Find the derivative of 


F(x)= aE xeR. 


5. Prove that the antiderivatives of a piecewise continuous function $f : [a,b] \to \mathbb{R}$
are piecewise $C^1$ functions.
6. Let $a$ be a real number. Prove that the discontinuous function

$$f_a(x) = \begin{cases} a & \text{if } x = 0 \\ \sin(1/x) & \text{if } x \ne 0 \end{cases}$$

has an antiderivative if and only if $a = 0$.
7. (a) Let $G(x) = \arctan(\sqrt{2}\,\tan x)$. Compute $G'(x)$.
(b) Compute an antiderivative of the function 


1 
f :[0,37/4) > R, f(x) = erry 


(c) Compute $\int_{\pi/4}^{3\pi/4} f(x)\,dx$.
8. (Wallis’s formula). (a) Let

$$I_n = \int_0^{\pi/2} \sin^n x\,dx, \quad n \in \mathbb{N}.$$

Prove, using integration by parts twice, that for $n \ge 2$ the recurrence formula
$I_n = \frac{n-1}{n}\, I_{n-2}$ holds and conclude that

$$I_{2n} = \frac{(2n-1)!!}{(2n)!!}\cdot\frac{\pi}{2} \quad\text{while}\quad I_{2n+1} = \frac{(2n)!!}{(2n+1)!!}$$

for all $n \in \mathbb{N}$.
(b) Integrating the inequalities $\sin^{2n+1} x \le \sin^{2n} x \le \sin^{2n-1} x$ on $[0, \pi/2]$,
find that

$$\frac{\pi}{2} = \lim_{n\to\infty} \frac{((2n)!!)^2}{((2n-1)!!)^2\,(2n+1)} = \lim_{n\to\infty} \frac{2\cdot 2}{1\cdot 3}\cdot\frac{4\cdot 4}{3\cdot 5}\cdots\frac{2n\cdot 2n}{(2n-1)(2n+1)} = \lim_{n\to\infty} \prod_{k=1}^{n}\left(1 - \frac{1}{4k^2}\right)^{-1}.$$

(This was the first representation of $\pi$ as a limit of rational numbers).
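9. The aim of this exercise is to offer a proof of Stirling’s formula,

$$\lim_{n\to\infty} \frac{n!}{n^n e^{-n}\sqrt{2\pi n}} = 1,$$

which provides an efficient way to approximate the factorial for large $n$:

$$n! \approx n^n e^{-n}\sqrt{2\pi n}.$$

(a) Show, using the differential calculus, that

$$\frac{1}{n + 1/2} < \ln(n+1) - \ln n < \frac{1}{2}\left(\frac{1}{n} + \frac{1}{n+1}\right).$$

(b) Let $a_n = \dfrac{n!}{n^n e^{-n}\sqrt{2\pi n}}$. From part (a) conclude that

$$0 < \ln\frac{a_n}{a_{n+1}} < \frac{1}{4}\left(\frac{1}{n} - \frac{1}{n+1}\right)$$

and then that

$$1 < \frac{a_n}{a_{n+k}} = \frac{a_n}{a_{n+1}}\cdot\frac{a_{n+1}}{a_{n+2}}\cdots\frac{a_{n+k-1}}{a_{n+k}} < e^{\frac{1}{4}\left(\frac{1}{n} - \frac{1}{n+k}\right)}$$

for all $n, k \in \mathbb{N}^*$.

(c) The sequence $(a_n)_n$ is strictly decreasing and bounded below by 0. Let $a = \lim_{n\to\infty} a_n \ge 0$.
Letting $k \to \infty$ in the inequality above, infer that $1 < a_n/a \le e^{1/(4n)}$
for all $n \in \mathbb{N}^*$, and thus $a > 0$.
(d) Note that

$$a = \lim_{n\to\infty} \frac{a_n^2}{a_{2n}} = \lim_{n\to\infty} \frac{2^{2n}(n!)^2}{(2n)!\,\sqrt{\pi n}}.$$

Use Wallis’ formula to conclude that $a = 1$.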
10. Deduce Euler’s formula for log 2,


from the identity


restricted to $x \in [0, 1/2]$.
11. We know the Maclaurin series expansion

$$\arcsin x = x + \sum_{n=1}^{\infty} \frac{1\cdot 3\cdots(2n-1)}{2\cdot 4\cdots(2n)}\cdot\frac{x^{2n+1}}{2n+1}$$

for $x \in [-1, 1]$. Substituting $x = \sin t$, we get

$$t = \sin t + \sum_{n=1}^{\infty} \frac{1\cdot 3\cdots(2n-1)}{2\cdot 4\cdots(2n)}\cdot\frac{\sin^{2n+1} t}{2n+1}, \quad t \in \left[-\frac{\pi}{2}, \frac{\pi}{2}\right].$$

Notice the possibility to integrate term by term on $[0, \pi/2]$ and infer Euler’s
formulas,

$$\sum_{n=0}^{\infty} \frac{1}{(2n+1)^2} = \frac{\pi^2}{8} \quad\text{and}\quad \sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}.$$

12. (Young’s Inequality). Let $f : [0,a] \to [0, f(a)]$ be a strictly increasing continuous
function such that $f(0) = 0$. Using the definition of derivative show
that

$$F(x) = \int_0^x f(t)\,dt + \int_0^{f(x)} f^{-1}(y)\,dy - x f(x)$$

is differentiable on $[0, a]$ and $F'(x) = 0$ for all $x \in [0, a]$. This gives us the
identity

$$x f(x) = \int_0^x f(t)\,dt + \int_0^{f(x)} f^{-1}(y)\,dy \quad\text{for } x \in [0, a].$$

Deduce from this identity the inequality

$$uv \le \int_0^u f(t)\,dt + \int_0^v f^{-1}(y)\,dy$$

for all $0 \le u \le a$ and $0 \le v \le f(a)$. See Fig. 9.3.


Fig. 9.3. The geometric meaning of Young’s inequality


13. (An application of Gauss’ Arithmetic-Geometric Mean). Not all integrals can
be computed in compact form using primitives. An example is offered by the 
following integral 


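$$I(a,b) = \int_0^{\pi/2} \sqrt{a^2\cos^2\theta + b^2\sin^2\theta}\,d\theta, \quad\text{for } a > b > 0,$$

which represents one-fourth of the length of the ellipse $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$. See the
note after Exercise 3, Sect. B3, in Appendix B.

Prove that the change of variable $\sin\theta = \dfrac{2a\sin t}{(a+b) + (a-b)\sin^2 t}$
yields the equality

$$I(a,b) = I\left(\frac{a+b}{2}, \sqrt{ab}\right)$$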


and infer from Exercise 4, Sect. 2.2, that J(a, b) = 3M(a, b), where M(a, b) 
is Gauss’ Arithmetic-Geometric Mean of a and b. 

14. Prove that for every polynomial $P(x)$ of degree at most 3 and all $a < b$, we
have

$$\int_a^b P(x)\,dx = \frac{b-a}{6}\left[P(a) + 4P\left(\frac{a+b}{2}\right) + P(b)\right].$$


15. (Simpson’s Parabolic Rule). Let $f \in C^4([a,b],\mathbb{R})$ and let $\Delta_n$ be a division
of the interval $[a,b]$ into $2n$ subintervals of equal length. Let $y_k = f(x_k)$ for
$k \in \{0, \dots, 2n\}$. Then, we have the following approximation formula

$$\int_a^b f(x)\,dx \approx \frac{b-a}{6n}\left(y_0 + y_{2n} + 2(y_2 + y_4 + \cdots + y_{2n-2}) + 4(y_1 + y_3 + \cdots + y_{2n-1})\right),$$

the error being at most

$$\frac{(b-a)^5}{2880\,n^4}\,\sup\left\{|f^{(4)}(x)| : x \in [a,b]\right\}.$$

Compare it with the previous problem. From a geometric point of view, this formula
comes from approximating $f$ on each subinterval $[x_{2k}, x_{2k+2}]$ with the
(vertical) parabola passing through the points $(x_{2k}, y_{2k})$, $(x_{2k+1}, y_{2k+1})$, and
$(x_{2k+2}, y_{2k+2})$. Details are available in any calculus textbook.

Use Simpson’s Rule with $n = 2$ to approximate

$$\int_0^1 \frac{1}{1+x^2}\,dx = \frac{\pi}{4}.$$

Note Computing integrals can be made easier by using computer programs like
Maple or Mathematica (that are able to perform even symbolic computations).
In the case of the last integral, both indicate the exact value, $\pi/4$. However,
the reader should be aware that most integrals cannot be computed in compact
form.
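For readers who want to try the last exercise numerically, here is a minimal sketch of the composite Simpson rule on $2n$ equal subintervals. The function name `simpson` is a choice made here, and Python is used only as a convenient stand-in for the computer algebra systems mentioned in the note.

```python
import math

def simpson(f, a, b, n):
    """Composite Simpson rule on 2n equal subintervals of [a, b]."""
    h = (b - a) / (2 * n)
    ys = [f(a + k * h) for k in range(2 * n + 1)]
    even = sum(ys[2:2 * n:2])   # y_2 + y_4 + ... + y_{2n-2}
    odd = sum(ys[1:2 * n:2])    # y_1 + y_3 + ... + y_{2n-1}
    return (b - a) / (6 * n) * (ys[0] + ys[2 * n] + 2 * even + 4 * odd)

approx = simpson(lambda x: 1 / (1 + x * x), 0.0, 1.0, 2)
print(approx, math.pi / 4, abs(approx - math.pi / 4))
```

With `n = 1` the same function reproduces the three-point formula of the preceding exercise, which is exact for polynomials of degree at most 3.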


9.3 Mean Value Theorems 


The mean value of a regular function (and more generally of an integrable function)
$f : [a,b] \to \mathbb{R}$ is

$$M(f) = \frac{1}{b-a}\int_a^b f(x)\,dx.$$

Geometrically, if $f \ge 0$, $M(f)$ represents the height of the rectangle of base
$[a,b]$ that has the same area as the region under the graph of $f$.


9.3.1 The Mean Value Theorem Let $f : [a,b] \to \mathbb{R}$ be a continuous function and
$g : [a,b] \to \mathbb{R}$ a regular positive function. Then there is $c \in [a,b]$ such that

$$\int_a^b f(x)g(x)\,dx = f(c)\cdot\int_a^b g(x)\,dx.$$


Proof Let

$$m = \min_{x\in[a,b]} f(x) \quad\text{and}\quad M = \max_{x\in[a,b]} f(x).$$

Then $m g(x) \le f(x)g(x) \le M g(x)$ for all $x \in [a,b]$. Integrating these inequalities,
we get

$$m\int_a^b g(x)\,dx \le \int_a^b f(x)g(x)\,dx \le M\int_a^b g(x)\,dx.$$

If $\int_a^b g(x)\,dx = 0$, then the previous inequalities imply that $\int_a^b f(x)g(x)\,dx = 0$
and thus, the relation in the statement is true for every $c$. If $\int_a^b g(x)\,dx \ne 0$, then

$$m \le \frac{\int_a^b f(x)g(x)\,dx}{\int_a^b g(x)\,dx} \le M,$$

and it remains to apply the Intermediate Value Theorem.


9.3.2 Corollary A continuous function $f : [a,b] \to \mathbb{R}$ whose mean value is 0 is
equal to 0 at least once.


Using the previous corollary, we can easily prove that every trigonometric polynomial
of the form

$$T(x) = \sum_{k=1}^{n} (a_k \cos kx + b_k \sin kx)$$

has at least one zero in $[0, 2\pi]$.
An equivalent form of Corollary 9.3.2, useful in the study of inequalities is as 
follows: 


9.3.3 Corollary If $f, g : [a,b] \to \mathbb{R}$ are continuous functions such that $f(x) \ge g(x)$
for all $x \in [a,b]$ and $f$ and $g$ differ at least at one point, then

$$\int_a^b f(x)\,dx > \int_a^b g(x)\,dx.$$
Let us mention another mean value theorem. 


9.3.4 Bonnet’s Mean Value Theorem (also called the Second Mean Value Theorem)
Suppose that $f$ and $g$ are two real functions defined on an interval $[a,b]$ such
that $f$ is continuous and $g$ is monotone. Then there is $c \in [a,b]$ such that

$$\int_a^b f(x)g(x)\,dx = g(a+)\int_a^c f(x)\,dx + g(b-)\int_c^b f(x)\,dx.$$

Clearly, it suffices to consider the case where $g$ is increasing. Then, the conclusion
follows by replacing $g(x)$ by $g(x) - g(a+)$ in the next lemma.


9.3.5 Lemma Suppose that $f$ and $g$ are two real functions defined on an interval
$[a,b]$ such that $f$ is continuous and $g$ is increasing and positive. Then, there is a
point $c \in [a,b]$ such that

$$\int_a^b f(x)g(x)\,dx = g(b-)\int_c^b f(x)\,dx.$$


Proof Let $\Delta_n$ be the division determined by the points $x_k = a + k\,\frac{b-a}{n}$, for
$k = 0, \dots, n$. Then

$$I = \int_a^b f(x)g(x)\,dx = \sum_{k=0}^{n-1}\int_{x_k}^{x_{k+1}} g(x)f(x)\,dx = \sigma_n + \rho_n,$$

where

$$\sigma_n = \sum_{k=0}^{n-1} g(x_k)\int_{x_k}^{x_{k+1}} f(x)\,dx \quad\text{and}\quad \rho_n = \sum_{k=0}^{n-1}\int_{x_k}^{x_{k+1}} f(x)\,(g(x) - g(x_k))\,dx.$$

If $L$ is an upper bound for $|f|$, then

$$|\rho_n| \le \sum_{k=0}^{n-1}\int_{x_k}^{x_{k+1}} |f(x)|\,(g(x) - g(x_k))\,dx \le L\sum_{k=0}^{n-1} (g(x_{k+1}) - g(x_k))(x_{k+1} - x_k) = \frac{L(b-a)}{n}\,(g(b) - g(a)) \to 0$$

as $n \to \infty$. By considering the function $F(x) = \int_x^b f(t)\,dt$ (which is continuous),
we get

$$\sigma_n = \sum_{k=0}^{n-1} g(x_k)\int_{x_k}^{x_{k+1}} f(x)\,dx = \sum_{k=0}^{n-1} g(x_k)\,(F(x_k) - F(x_{k+1}))$$

$$= \sum_{k=1}^{n-1} F(x_k)\,(g(x_k) - g(x_{k-1})) + F(x_0)g(x_0) - F(x_n)g(x_{n-1})$$

$$= \sum_{k=1}^{n-1} F(x_k)\,(g(x_k) - g(x_{k-1})) + F(x_0)g(x_0)$$

since $F(b) = 0$. Letting

$$m = \min_{x\in[a,b]} F(x) \quad\text{and}\quad M = \max_{x\in[a,b]} F(x),$$

it is easy to see that $\sigma_n$ lies in between $m\,g(x_{n-1})$ and $M\,g(x_{n-1})$ and thus, in between
$m\,g(b-)$ and $M\,g(b-)$. Since this is the range of the continuous function $g(b-)\cdot F$,
there must be a number $c_n \in [a,b]$ such that $g(b-)\cdot F(c_n) = \sigma_n$. The proof ends
by choosing a limit point $c$ for $(c_n)_n$.


A nice application of this theorem was given by Stark [4]. Let 


2n+1 


5 sin nx 
Dy(x) = > ae = 7 x? 
20 27 sin 5 
k=-—2n-1 


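where the last equality is true whenever $e^{ix} \ne 1$. Computing directly $\int_0^{\pi} x D_n(x)\,dx$
and then evaluating it via Bonnet’s Mean Value Theorem applied to the function

$$g(x) = \begin{cases} 1 & \text{if } x = 0 \\ \dfrac{x/2}{\sin(x/2)} & \text{if } x \in (0, \pi], \end{cases}$$

we obtain Euler’s formulas,

$$\sum_{k=1}^{\infty} \frac{1}{(2k-1)^2} = \frac{\pi^2}{8} \quad\text{and}\quad \sum_{k=1}^{\infty} \frac{1}{k^2} = \frac{\pi^2}{6}.$$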


Exercises 


1. (Hermite-Hadamard inequality). Let $f : [a,b] \to \mathbb{R}$ be a continuous convex
function.


(a) Prove the following two-sided estimate for the integral mean value of the
function $f$,

$$f\left(\frac{a+b}{2}\right) \le \frac{1}{b-a}\int_a^b f(x)\,dx \le \frac{f(a) + f(b)}{2},$$

the inequalities being strict except for the affine functions.
(b) Prove that the right-hand side inequality can be strengthened as

$$\frac{1}{b-a}\int_a^b f(x)\,dx \le \frac{1}{2}\left[f\left(\frac{a+b}{2}\right) + \frac{f(a) + f(b)}{2}\right].$$

[Hint: (a) Notice that the graph of $f$ is in between the support line passing through
the point $\left(\frac{a+b}{2}, f\left(\frac{a+b}{2}\right)\right)$ and the line passing through the endpoints of the graph.
Integrate these two inequalities. For the equality case, use Corollary 9.3.2.]

2. Let $f : [a,b] \to \mathbb{R}$ be a $C^1$-function. Prove that the sequence of general term


*) 
has limit (b — a) (f(b) — f(a)) /2, a fact which improves the result of Exercise 


2 in Sect.9.1. 
[Hint. Let x, =a+k(b—a) /n, fork € {0,...,n}. Then 


b n 
[tena pate 
a k=1 


b 
b > n 
/ f(x)dx — —= >" for) 
a k=1 


n Xk 
= | 2 Oe 


kal X—Xz 
Xk 
any faa few) if _— 
k=1 Ck ~ Xk 


b-a b-a , 
=> i 5 ras). 


3. (A proof of the irrationality of $\pi$). The Legendre family of polynomials is

$$Q_n(x) = \frac{1}{n!}\,\frac{d^n}{dx^n}\,(x - x^2)^n$$

for $n \in \mathbb{N}$. We have $Q_0(x) = 1$, $Q_1(x) = 1 - 2x$, $Q_2(x) = 1 - 6x + 6x^2$ and
so on.
(a) If $f \in C^n([0,1])$, prove that

$$\int_0^1 Q_n(x) f(x)\,dx = \frac{(-1)^n}{n!}\int_0^1 x^n (1-x)^n\,\frac{d^n f}{dx^n}(x)\,dx.$$

(b) Show that each of the integrals $\int_0^1 x^k \sin(\pi x)\,dx$ represents a polynomial of
degree $k$ in $\pi$, with integer coefficients, divided by $\pi^{k+1}$.

(c) Prove that $\int_0^1 Q_n(x)\sin(\pi x)\,dx \ne 0$ for every index $n \in \mathbb{N}$.

(d) Suppose that $\pi$ is a rational number, and $\pi = a/b$ for some $a, b \in \mathbb{N}^*$. Then

$$1 \le a^{n+1}\left|\int_0^1 Q_n(x)\sin(\pi x)\,dx\right| = \frac{a^{n+1}}{n!}\left|\int_0^1 x^n (1-x)^n\,\frac{d^n}{dx^n}(\sin \pi x)\,dx\right| \le \frac{a^{n+1}}{n!}\int_0^1 x^n (1-x)^n \pi^n\,dx$$

for all $n \in \mathbb{N}$. Here we used the fact that $\frac{d^n}{dx^n}\sin(\pi x)$ equals either $\pm\pi^n \sin(\pi x)$
or $\pm\pi^n \cos(\pi x)$.
Since the maximum of $x(1-x)$ on $[0,1]$ is $\frac{1}{4}$, this leads to

$$1 \le \frac{a^{n+1}}{n!}\left(\frac{\pi}{4}\right)^n$$

for all $n$, which is a contradiction since the right hand side converges to 0.
Note The first proof of the irrationality of $\pi$ was given by J. Lambert in 1767.


9.4 Dependence of Parameters 


In this section, we will discuss the problem of continuity, differentiability, and inte- 
grability of some functions defined via integrals. 


9.4.1 Theorem Let $f : [a,b] \times [c,d] \to \mathbb{R}$, $f = f(x,t)$, be a function continuous
with respect to both variables. Then

$$F(t) = \int_a^b f(x,t)\,dx$$

is continuous on $[c,d]$.


Proof We will show that in fact $F$ is uniformly continuous. Since $f$ is continuous
on a compact set, $f$ is also uniformly continuous. See Theorem 6.6.4. Let $\varepsilon > 0$.
Then, there is $\delta > 0$ such that $|x' - x''| < \delta$ and $|t' - t''| < \delta$ imply

$$|f(x', t') - f(x'', t'')| < \varepsilon/(b-a).$$

Thus, for such $t'$ and $t''$,

$$|F(t') - F(t'')| \le \int_a^b |f(x,t') - f(x,t'')|\,dx \le \frac{\varepsilon}{b-a}\,(b-a) = \varepsilon.$$


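The next lemma deals with functions $u = u(x,t)$ of two real variables. For
such functions, we may consider their partial derivatives $\frac{\partial u}{\partial x}$ and $\frac{\partial u}{\partial t}$, computed as
derivatives with respect to one variable keeping the other one fixed. More precisely,

$$\frac{\partial u}{\partial x}(x_0, t_0) = \lim_{x\to x_0} \frac{u(x, t_0) - u(x_0, t_0)}{x - x_0}$$

and

$$\frac{\partial u}{\partial t}(x_0, t_0) = \lim_{t\to t_0} \frac{u(x_0, t) - u(x_0, t_0)}{t - t_0}.$$

For example, in the case of the function $u(x,t) = e^x \sin xt$, we have $\frac{\partial u}{\partial x} = e^x(\sin xt + t\cos xt)$
and $\frac{\partial u}{\partial t} = x e^x \cos xt$.

A function which admits both partial derivatives of first order, these being continuous
(with respect to both variables), is called a function of class $C^1$.


The derivatives of second order are defined via the formulas:

$$\frac{\partial^2 u}{\partial x^2} = \frac{\partial}{\partial x}\left(\frac{\partial u}{\partial x}\right), \quad \frac{\partial^2 u}{\partial t\,\partial x} = \frac{\partial}{\partial t}\left(\frac{\partial u}{\partial x}\right),$$

$$\frac{\partial^2 u}{\partial x\,\partial t} = \frac{\partial}{\partial x}\left(\frac{\partial u}{\partial t}\right), \quad \frac{\partial^2 u}{\partial t^2} = \frac{\partial}{\partial t}\left(\frac{\partial u}{\partial t}\right).$$

A function which admits all partial derivatives of second order, these being continuous,
is called a function of class $C^2$. In this case, the mixed partial derivatives are
necessarily equal:

$$\frac{\partial^2 u}{\partial t\,\partial x} = \frac{\partial^2 u}{\partial x\,\partial t}.$$

This is explained in any book on several variables calculus.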
9.4.2 Lemma Let $f : [a,b] \times [c,d] \to \mathbb{R}$, $f = f(x,t)$, be a function continuous
with respect to both variables, differentiable with respect to $t$ and such that $\frac{\partial f}{\partial t}$ is
continuous with respect to both variables. Then

$$F(t) = \int_a^b f(x,t)\,dx$$

is a $C^1$ function on $[c,d]$ and

$$F'(t) = \int_a^b \frac{\partial f}{\partial t}(x,t)\,dx.$$


Proof It suffices to justify the formula for $F'$. The continuity of $F'$ follows from
the previous theorem. Let $t_0 \in [c,d]$. Then, if $t \ne t_0$,

$$\frac{F(t) - F(t_0)}{t - t_0} = \int_a^b \frac{f(x,t) - f(x,t_0)}{t - t_0}\,dx.$$

The formula for $F'$ is now a consequence of Theorem 9.4.1, due to the continuity of
the function

$$g(x,t) = \begin{cases} [f(x,t) - f(x,t_0)]/(t - t_0) & \text{if } t \ne t_0 \\ \dfrac{\partial f}{\partial t}(x, t_0) & \text{if } t = t_0 \end{cases}$$

with respect to both variables.


9.4.3 Theorem (Leibniz’ Rule of derivation under integral sign) Let
$f : [a,b] \times [c,d] \to \mathbb{R}$, $f = f(x,t)$,
be a function as in Lemma 9.4.2 and let $g : [c,d] \to [a,b]$ be a $C^1$ function. Then

$$F(t) = \int_a^{g(t)} f(x,t)\,dx$$

is a $C^1$ function on $[c,d]$ and

$$F'(t) = \int_a^{g(t)} \frac{\partial f}{\partial t}(x,t)\,dx + f(g(t), t)\cdot g'(t).$$


Proof Let $t_0 \in [c,d]$. Then, if $t \ne t_0$,

$$\frac{F(t) - F(t_0)}{t - t_0} = \int_a^{g(t_0)} \frac{f(x,t) - f(x,t_0)}{t - t_0}\,dx + \frac{1}{t - t_0}\int_{g(t_0)}^{g(t)} f(x,t)\,dx.$$

By Theorem 9.4.1, the first integral in the right-hand side approaches

$$\int_a^{g(t_0)} \frac{\partial f}{\partial t}(x, t_0)\,dx$$

as $t \to t_0$. According to the First Mean Value Theorem for integrals, there is a point
$x_t$ in between $g(t_0)$ and $g(t)$ such that

$$\frac{1}{t - t_0}\int_{g(t_0)}^{g(t)} f(x,t)\,dx = \frac{g(t) - g(t_0)}{t - t_0}\, f(x_t, t).$$

Therefore

$$\lim_{t\to t_0} \frac{1}{t - t_0}\int_{g(t_0)}^{g(t)} f(x,t)\,dx = g'(t_0)\cdot f(g(t_0), t_0).$$

Thus, the derivative of $F$ has the desired form. The continuity of $F'$ follows from
Theorem 9.4.1 and the operations with continuous functions.


See Exercise 5 for an application involving the second-order partial derivatives. 
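Leibniz' rule is easy to test numerically. The following sketch uses an example chosen here purely for illustration, $f(x,t) = e^{-xt}$ with $a = 0$ and $g(t) = t^2$, and compares the value given by the rule with a central difference quotient of $F$; the helper `simpson` is a standard composite Simpson rule written for this purpose.

```python
import math

def simpson(func, lo, hi, n=200):
    """Composite Simpson rule with 2n subintervals (helper for this sketch)."""
    h = (hi - lo) / (2 * n)
    total = func(lo) + func(hi)
    for k in range(1, 2 * n):
        total += (4 if k % 2 else 2) * func(lo + k * h)
    return total * h / 3

# Illustrative data: f(x, t) = exp(-x t), lower limit a = 0, upper limit g(t) = t^2.
f = lambda x, t: math.exp(-x * t)
df_dt = lambda x, t: -x * math.exp(-x * t)
g = lambda t: t * t
dg = lambda t: 2 * t

def F(t):
    return simpson(lambda x: f(x, t), 0.0, g(t))

def F_prime_leibniz(t):
    # Integral of the partial derivative plus the boundary term f(g(t), t) g'(t).
    return simpson(lambda x: df_dt(x, t), 0.0, g(t)) + f(g(t), t) * dg(t)

t0, h = 1.2, 1e-5
finite_difference = (F(t0 + h) - F(t0 - h)) / (2 * h)
print(F_prime_leibniz(t0), finite_difference)
```

For this particular choice $F(t) = (1 - e^{-t^3})/t$ in closed form, so both numbers can also be checked against the exact derivative.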


9.4.4 Fubini’s Theorem Let $f : [a,b] \times [c,d] \to \mathbb{R}$, $f = f(x,t)$, be a continuous
function with respect to both variables. Then the functions

$$I(x) = \int_c^d f(x,t)\,dt \quad\text{and}\quad J(t) = \int_a^b f(x,t)\,dx$$

are continuous on $[a,b]$ and $[c,d]$ respectively and their integrals are equal, that is,

$$\int_a^b\left(\int_c^d f(x,t)\,dt\right)dx = \int_c^d\left(\int_a^b f(x,t)\,dx\right)dt.$$


Proof The continuity follows from Theorem 9.4.1. This allows us to consider the
functions

$$F(z) = \int_a^b\left(\int_c^z f(x,t)\,dt\right)dx \quad\text{and}\quad G(z) = \int_c^z\left(\int_a^b f(x,t)\,dx\right)dt$$

for $z \in [c,d]$. By Leibniz’ Rule of derivation under integral sign,

$$F'(z) = G'(z) = \int_a^b f(x,z)\,dx$$

for all $z \in [c,d]$. Since $F(c) = G(c) = 0$, we conclude that $F(z) = G(z)$ for all
$z \in [c,d]$, particularly for $z = d$.
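The equality of the two iterated integrals can be observed numerically. The sketch below uses an integrand picked here for illustration, $f(x,t) = x\cos(xt)$ on $[0,1]\times[0,\pi]$, for which integrating first in $t$ gives $\sin(\pi x)$ and hence the common value $2/\pi$; `simpson` is again a helper written for this sketch.

```python
import math

def simpson(func, lo, hi, n=50):
    """Composite Simpson rule with 2n subintervals (helper for this sketch)."""
    h = (hi - lo) / (2 * n)
    total = func(lo) + func(hi)
    for k in range(1, 2 * n):
        total += (4 if k % 2 else 2) * func(lo + k * h)
    return total * h / 3

def f(x, t):
    return x * math.cos(x * t)   # illustrative continuous integrand

a, b, c, d = 0.0, 1.0, 0.0, math.pi
dt_then_dx = simpson(lambda x: simpson(lambda t: f(x, t), c, d), a, b)
dx_then_dt = simpson(lambda t: simpson(lambda x: f(x, t), a, b), c, d)
print(dt_then_dx, dx_then_dt, 2 / math.pi)
```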


Exercises 


1. Compute <. (focs* cos(arxt) dt) . 


2. Consider the function 
20 


I(r) = / log(1 — 2r cos x +r”) dx 
0 


forr € R, |r| 4 1. Prove that: 
(a) 1(0) = 0, I(—r) = I(r) and I(r) = 27 log |r| + J(1/r) forr 4 0. 
(b) I(r) = 0 for |r| < 1. 

3. Let $m, n \in \mathbb{N}$, $n \ge 2$. Compute

$$\int_0^1 x^{n-1}\log^m x\,dx;$$

here the function $x^{n-1}\log^m x$ is extended by continuity at $x = 0$.
4. One can prove that $\int_0^{\pi} \frac{dx}{a + b\cos x} = \pi/\sqrt{a^2 - b^2}$ for $a > b > 0$. Infer the formula

$$\int_0^{\pi} \frac{dx}{(a + b\cos x)^2} = \frac{\pi a}{(a^2 - b^2)^{3/2}}.$$

5. (d’Alembert’s Formula). Suppose that $u_0 \in C^2(\mathbb{R},\mathbb{R})$ and $u_1 \in C^1(\mathbb{R},\mathbb{R})$. Prove
that the function

$$u(x,t) = \frac{u_0(x - at) + u_0(x + at)}{2} + \frac{1}{2a}\int_{x-at}^{x+at} u_1(\tau)\,d\tau$$

is a solution of class $C^2$ on $\mathbb{R}\times\mathbb{R}$ for the one-dimensional wave equation,

$$\frac{\partial^2 u}{\partial t^2} = a^2\,\frac{\partial^2 u}{\partial x^2},$$

accompanied by the initial conditions

$$u(x, 0) = u_0(x) \quad\text{and}\quad \frac{\partial u}{\partial t}(x, 0) = u_1(x) \quad\text{for } x \in \mathbb{R}.$$

See Sect. 12.3 for another problem concerning the wave equation.

6. (Integration of continuous functions over compact rectangles). Using Fubini’s
theorem, we can define the integral over the rectangle $[a,b] \times [c,d]$ of a function
$f : [a,b] \times [c,d] \to \mathbb{R}$ continuous with respect to both variables $x$ and $y$ as

$$\iint_{[a,b]\times[c,d]} f(x,y)\,dx\,dy = \int_a^b\left(\int_c^d f(x,y)\,dy\right)dx = \int_c^d\left(\int_a^b f(x,y)\,dx\right)dy.$$

(a) Prove that if $a = x_0 < x_1 < \cdots < x_m = b$ and $c = y_0 < y_1 < \cdots < y_n = d$,
then

$$\iint_{[a,b]\times[c,d]} f(x,y)\,dx\,dy = \sum_{j=0}^{m-1}\sum_{k=0}^{n-1} \iint_{[x_j, x_{j+1}]\times[y_k, y_{k+1}]} f(x,y)\,dx\,dy.$$

(b) Prove that the previous decomposition formula works for any decomposition
of the rectangle $R = [a,b] \times [c,d]$ into rectangles $R_1, R_2, \dots, R_p$, such that any
two of them have no interior points in common.

(c) Let $R = \bigcup_{j=1}^{p} R_j$ be a decomposition as above. Prove that if for each
subrectangle, at least one of the sides has integer length, then the same is true
for $R$.

[Hint: (c) Look at $f(x,y) = e^{2\pi i(x+y)}$.]


9.5 The Riemann Integral 


Riemann was the first to give a clear definition of the concept of integral. After 
him, the interest in integral calculus switched from the problem of computation of 
particular integrals to the general theories of integration. 

Riemann’s theory was inspired by the rectangle method of approximating the 
integrals of regular functions. 

If $\Delta$ is a division of the nondegenerate interval $[a,b]$, consisting of the points

$$a = x_0 < x_1 < \cdots < x_n = b,$$

then we call the norm of $\Delta$ the number

$$\|\Delta\| = \max_{0\le k\le n-1} (x_{k+1} - x_k).$$

If $\Delta_1$ and $\Delta_2$ are two divisions of $[a,b]$, then we say that $\Delta_2$ is finer than $\Delta_1$ (or
$\Delta_2$ is a refinement of $\Delta_1$) if $\Delta_1 \subset \Delta_2$. Obviously, if $\Delta_2$ is finer than $\Delta_1$, then
$\|\Delta_2\| \le \|\Delta_1\|$, the converse not being necessarily true. Notice that for any two
divisions $\Delta_1$ and $\Delta_2$, there is always a division finer than both ($\Delta = \Delta_1 \cup \Delta_2$ is an
example). This means that the set of all divisions of an interval is a directed set.

For a division $\Delta = (x_k)_{k=0}^{n}$, we define an intermediate $\Delta$-point system to be any
family $\xi = (\xi_k)_{k=0}^{n-1}$ such that $\xi_k \in [x_k, x_{k+1}]$ for $k = 0, 1, \dots, n-1$.

If $\Delta$ is a division of $[a,b]$, $\xi = (\xi_k)_{k=0}^{n-1}$ an intermediate $\Delta$-point system and
$f : [a,b] \to \mathbb{R}$, we define the Riemann sum of $f$ associated to $\Delta$ and $\xi$ to be

$$S_{(\Delta,\xi)}(f) = \sum_{k=0}^{n-1} f(\xi_k)(x_{k+1} - x_k).$$

We are interested in the behavior of the Riemann sums as the norm of the divisions
approaches 0.
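To get a feeling for this limiting behavior, one may compute Riemann sums for randomly chosen divisions and intermediate points. The sketch below is only an experiment: the helper names, the test function $f(x) = x^2$ on $[0,1]$, and the use of a seeded random generator are choices made here.

```python
import random

def riemann_sum(f, points, tags):
    """Riemann sum of f for the division `points` and intermediate points `tags`."""
    return sum(f(xi) * (x1 - x0) for x0, x1, xi in zip(points, points[1:], tags))

def random_division(a, b, n, rng):
    """A division of [a, b] with n subintervals and random interior points."""
    inner = sorted(rng.uniform(a, b) for _ in range(n - 1))
    return [a] + inner + [b]

def norm(points):
    """Norm of a division: the length of its longest subinterval."""
    return max(x1 - x0 for x0, x1 in zip(points, points[1:]))

rng = random.Random(0)
f = lambda x: x * x
for n in (10, 100, 1000, 10000):
    pts = random_division(0.0, 1.0, n, rng)
    tags = [rng.uniform(x0, x1) for x0, x1 in zip(pts, pts[1:])]
    print(n, norm(pts), riemann_sum(f, pts, tags))
```

As the norm shrinks, the printed sums cluster around $1/3$ whatever intermediate points are drawn; for this $f$ the deviation is at most twice the norm of the division.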


9.5.1 Definition A function $f : [a,b] \to \mathbb{R}$ is called Riemann integrable if there is
a real number $I$ with the following property: for every $\varepsilon > 0$, there is $\delta > 0$ such
that for every division $\Delta$ with $\|\Delta\| < \delta$ and every choice of an intermediate $\Delta$-point
system $\xi$, we have

$$|I - S_{(\Delta,\xi)}(f)| < \varepsilon.$$

It is not difficult to see that if such a number $I$ exists, then it is uniquely determined
by the function and the interval under attention. We call $I$ the Riemann integral of
$f$ over the interval $[a,b]$. Usually, it is denoted by one of the symbols

$$\int_a^b f(x)\,dx \quad\text{and}\quad \int_a^b f\,dx.$$

It may look like we are abusing the notation used for the integral of regular
functions, but we will show soon that we have serious reasons to do it.
As in the case of regular functions, we define

$$\int_b^a f(x)\,dx = -\int_a^b f(x)\,dx$$

and

$$\int_a^a f(x)\,dx = 0.$$


Remark The definition of Riemann integrability makes sense for complex-valued
functions (and beyond). A function $f : [a,b] \to \mathbb{C}$ is Riemann integrable if and
only if $\operatorname{Re} f$ and $\operatorname{Im} f$ both have this property. Moreover,

$$\int_a^b f(x)\,dx = \int_a^b \operatorname{Re} f(x)\,dx + i\int_a^b \operatorname{Im} f(x)\,dx.$$

This follows easily from the fact that

$$\operatorname{Re} S_{(\Delta,\xi)}(f) = S_{(\Delta,\xi)}(\operatorname{Re} f) \quad\text{and}\quad \operatorname{Im} S_{(\Delta,\xi)}(f) = S_{(\Delta,\xi)}(\operatorname{Im} f).$$

As a consequence, the study of complex-valued integrable functions reduces to
that of real-valued functions.

The set $\mathcal{R}([a,b],\mathbb{R})$ of all real-valued Riemann integrable functions on $[a,b]$
is a real linear space and also a linear lattice of functions. See Exercise 7. The set
$\mathcal{R}([a,b])$ of all complex-valued Riemann integrable functions on $[a,b]$ is a complex
linear space.

The Riemann integration defines a linear functional $\int_a^b$ on $\mathcal{R}([a,b],\mathbb{R})$, which
obviously satisfies the properties (INT1), (INT2), and (INT4). As in the case of
regular functions, this leads to the conclusion that $\int_a^b$ is monotone. The proof of
property (INT3) within the framework of Riemann integral needs a preparation and
will be done in Corollary 9.5.9.


9.5.2 Theorem If $f : [a,b] \to \mathbb{R}$ is a regular function, then it is Riemann integrable
and its integral as a regular function coincides with that in the sense of Definition
9.5.1.


The proof is left as Exercise 1. Next, we will discuss the boundedness property 
of integrable functions. 


9.5.3 Theorem Every Riemann integrable function $f : [a,b] \to \mathbb{R}$ is bounded.


Proof Let $I$ be the value of the integral of $f$ and let $\varepsilon > 0$. By the definition of the
integral, there is a division $\Delta = (x_k)_{k=0}^{n}$ such that for every choice of an intermediate
$\Delta$-point system $\xi$, we have

$$I - \varepsilon < S_{(\Delta,\xi)}(f) < I + \varepsilon.$$

Suppose that $f$ is unbounded. Then, there is an index $k$ such that $f$ is unbounded on
$[x_k, x_{k+1}]$. We will consider the case when

$$\sup_{z\in[x_k, x_{k+1}]} f(z) = \infty.$$

Let $\xi = (\xi_k)_{k=0}^{n-1}$ be a $\Delta$-point system. Then there is $\xi_k' \in [x_k, x_{k+1}]$ such that
$f(\xi_k') > f(\xi_k) + 2\varepsilon/(x_{k+1} - x_k)$. Let $\xi'$ be the intermediate $\Delta$-point system
made out of $\xi$ by substituting $\xi_k'$ for $\xi_k$ (and keeping all the other points the same).
Since

$$S_{(\Delta,\xi')}(f) < I + \varepsilon \quad\text{and}\quad -S_{(\Delta,\xi)}(f) < -I + \varepsilon,$$

we obtain, by adding the two inequalities, that

$$S_{(\Delta,\xi')}(f) - S_{(\Delta,\xi)}(f) < 2\varepsilon.$$

At the same time, since except for the $k$th term the two sums are identical,

$$S_{(\Delta,\xi')}(f) - S_{(\Delta,\xi)}(f) = (f(\xi_k') - f(\xi_k))(x_{k+1} - x_k)$$

and so,

$$(f(\xi_k') - f(\xi_k))(x_{k+1} - x_k) < 2\varepsilon.$$

This implies that $f(\xi_k') < f(\xi_k) + 2\varepsilon/(x_{k+1} - x_k)$, which is a contradiction.


The fact that the Riemann integral has some other special properties (heredity, 
additivity, and more) requires knowledge of some criteria of integrability. 

The first problem at the end of this section describes integrability as a process of 
limit of sequences and the convergence of those sequences can be checked (as we 
know) by proving that they are Cauchy sequences. Instead of translating this into the 
terminology of integrals, we prefer to use a method developed by Darboux which 
eliminates the intermediate point systems. 

Let $f : [a,b] \to \mathbb{R}$ be a bounded function. For a division $\Delta = (x_k)_{k=0}^{n}$ of $[a,b]$,
let

$$m_k(\Delta) = \inf_{x\in[x_k, x_{k+1}]} f(x) \quad\text{and}\quad M_k(\Delta) = \sup_{x\in[x_k, x_{k+1}]} f(x).$$

We define the lower Darboux sum

$$LDS_\Delta(f) = \sum_{k=0}^{n-1} m_k(\Delta)(x_{k+1} - x_k),$$

and the upper Darboux sum

$$UDS_\Delta(f) = \sum_{k=0}^{n-1} M_k(\Delta)(x_{k+1} - x_k).$$

It is clear that

$$LDS_\Delta(f) \le S_{(\Delta,\xi)}(f) \le UDS_\Delta(f)$$

for any possible choice of the intermediate $\Delta$-point system. It is also clear that

$$LDS_\Delta(f) = \inf_{\xi} S_{(\Delta,\xi)}(f) \quad\text{and}\quad UDS_\Delta(f) = \sup_{\xi} S_{(\Delta,\xi)}(f).$$

Moreover, we have the following:


9.5.4 Lemma If $\Delta_2$ is finer than $\Delta_1$, then

$$LDS_{\Delta_1}(f) \le LDS_{\Delta_2}(f) \le UDS_{\Delta_2}(f) \le UDS_{\Delta_1}(f).$$

9.5.5 Corollary For any two divisions $\Delta_1$ and $\Delta_2$,

$$LDS_{\Delta_1}(f) \le UDS_{\Delta_2}(f),$$

that is, every lower Darboux sum does not exceed any upper Darboux sum.


Proof In fact,

$$LDS_{\Delta_1}(f) \le LDS_{\Delta_1\cup\Delta_2}(f) \le UDS_{\Delta_1\cup\Delta_2}(f) \le UDS_{\Delta_2}(f).$$


This situation leads us to the definition of

$$\underline{I}(f) = \sup_{\Delta} LDS_\Delta(f),$$

called the lower Darboux integral of $f$ and

$$\overline{I}(f) = \inf_{\Delta} UDS_\Delta(f),$$

called the upper Darboux integral of $f$. By Corollary 9.5.5, $\underline{I}(f) \le \overline{I}(f)$.
It turns out that the Riemann integrability of $f$ is equivalent to the existence of a
real number $I$ such that $I = \underline{I}(f) = \overline{I}(f)$.
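Darboux sums are especially easy to compute for monotone functions, since on each subinterval the infimum and supremum are attained at the endpoints. The following sketch, with the hypothetical helper `darboux_sums_increasing` and the sample function $f(x) = x^3$ on $[0,1]$ chosen here, shows the lower and upper sums squeezing the integral $1/4$ as the uniform division is refined.

```python
def darboux_sums_increasing(f, a, b, n):
    """Lower and upper Darboux sums of an increasing f on the uniform division with n pieces."""
    h = (b - a) / n
    xs = [a + k * h for k in range(n + 1)]
    # For an increasing function, inf = f(left endpoint) and sup = f(right endpoint).
    lower = sum(f(x0) * (x1 - x0) for x0, x1 in zip(xs, xs[1:]))
    upper = sum(f(x1) * (x1 - x0) for x0, x1 in zip(xs, xs[1:]))
    return lower, upper

f = lambda x: x ** 3
for n in (4, 16, 64, 256):
    lower, upper = darboux_sums_increasing(f, 0.0, 1.0, n)
    print(n, lower, upper, upper - lower)
```

Here the gap between the two sums equals $(f(1) - f(0))/n$, and passing from $n = 4$ to $n = 16$ raises the lower sum and decreases the upper sum, in agreement with Lemma 9.5.4.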


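9.5.6 Theorem (Darboux Criterion of Riemann Integrability) A real-valued function
$f$ defined on an interval $[a,b]$ is Riemann integrable if and only if for every
$\varepsilon > 0$, there is $\delta > 0$ such that

$$UDS_\Delta(f) - LDS_\Delta(f) < \varepsilon$$

for all divisions $\Delta$ with $\|\Delta\| < \delta$.


Proof First we show that the condition is necessary. Put $I = \int_a^b f(x)\,dx$ and choose
$\varepsilon > 0$. By the definition of integrability, there is $\delta > 0$ such that

$$I - \frac{\varepsilon}{3} < S_{(\Delta,\xi)}(f) < I + \frac{\varepsilon}{3}$$

for every division $\Delta$ with $\|\Delta\| < \delta$ and every intermediate $\Delta$-point system. Therefore

$$I - \frac{\varepsilon}{3} \le UDS_\Delta(f) \le I + \frac{\varepsilon}{3}$$

and

$$I - \frac{\varepsilon}{3} \le LDS_\Delta(f) \le I + \frac{\varepsilon}{3},$$

whence

$$UDS_\Delta(f) - LDS_\Delta(f) \le \left(I + \frac{\varepsilon}{3}\right) - \left(I - \frac{\varepsilon}{3}\right) = \frac{2\varepsilon}{3} < \varepsilon$$

for every division $\Delta$ with $\|\Delta\| < \delta$.
Now we will show that the condition is sufficient as well. Since

$$0 \le \overline{I}(f) - \underline{I}(f) \le UDS_\Delta(f) - LDS_\Delta(f) < \varepsilon$$

for every $\varepsilon > 0$, we get that $\underline{I}(f) = \overline{I}(f)$. Let $I$ be the common value of the two
Darboux integrals. We will show that $f$ is Riemann integrable and $\int_a^b f(x)\,dx = I$.
We know that

$$LDS_\Delta(f) \le I \le UDS_\Delta(f) \quad\text{and}\quad LDS_\Delta(f) \le S_{(\Delta,\xi)}(f) \le UDS_\Delta(f).$$

Then

$$|I - S_{(\Delta,\xi)}(f)| \le UDS_\Delta(f) - LDS_\Delta(f) < \varepsilon$$

when $\|\Delta\| < \delta$. This completes the proof.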
Next we list some consequences of the Darboux Criterion. 


9.5.7 Theorem (Inheritance of integrability) If $f$ is a Riemann integrable function
on $[a,b]$ and $[c,d] \subset [a,b]$, then $f$ is Riemann integrable on $[c,d]$.


9.5.8 Theorem (Additivity of integral) If $f$ is Riemann integrable on $[a,c]$ and on
$[c,b]$, then $f$ is Riemann integrable on $[a,b]$ and

$$\int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx.$$

9.5.9 Corollary If $f : [a,b] \to \mathbb{R}$ is Riemann integrable, then the function $F(x) = \int_a^x f(t)\,dt$
is Lipschitz on $[a,b]$.


For the next theorem, we need the concept of Jordan null set. A set $X \subset \mathbb{R}$ is
called a Jordan null set if for every $\varepsilon > 0$, there is a finite family of compact intervals
$([a_k, b_k])_{k=1}^{n}$ such that $X \subset \bigcup_{k=1}^{n} [a_k, b_k]$ and $\sum_{k=1}^{n} (b_k - a_k) < \varepsilon$ (in other words, if
$X$ can be covered by a finite number of intervals of arbitrarily small total length). If in
the previous definition, instead of compact intervals we use bounded open intervals,
we define the same concept. Indeed, if the property is true for a finite family of
bounded open intervals, then it will also be true for their closures. Conversely, let
$([a_k, b_k])_{k=1}^{n}$ be a family of compact intervals such that $X \subset \bigcup_{k=1}^{n} [a_k, b_k]$ and
$\sum_{k=1}^{n} (b_k - a_k) = L < \varepsilon$. Let

$$a_k' = a_k - \frac{\varepsilon - L}{3n} \quad\text{and}\quad b_k' = b_k + \frac{\varepsilon - L}{3n}.$$

Then, $X \subset \bigcup_{k=1}^{n} (a_k', b_k')$ and

$$\sum_{k=1}^{n} (b_k' - a_k') = \sum_{k=1}^{n} (b_k - a_k) + \frac{2(\varepsilon - L)}{3} = L + \frac{2(\varepsilon - L)}{3} < \varepsilon.$$

Most of the properties of Jordan null sets are straightforward. Every finite set is
a Jordan null set and every Jordan null set is bounded. Thus, the closure of a Jordan
null set is a compact Jordan null set.


9.5.10 Theorem (Property of stability) Let $f, g : [a,b] \to \mathbb{R}$ be two bounded
functions and let $X$ be a Jordan null subset of $[a,b]$ such that $f(x) = g(x)$ for
all $x \in [a,b] \setminus X$. If $f$ is Riemann integrable, then $g$ is Riemann integrable and

$$\int_a^b g(x)\,dx = \int_a^b f(x)\,dx.$$

In other words, when dealing with bounded functions, changes on Jordan null sets
influence neither the character of integrability nor the value of the integral.


Proof Notice first that if $\Delta = (x_k)_{k=0}^{n}$ is a division of $[a,b]$ and $([x_k, x_{k+1}])_{k=p}^{q}$ are
the subintervals of $\Delta$ intersecting an interval $[c,d] \subset [a,b]$, then

$$\sum_{k=p}^{q} (x_{k+1} - x_k) \le (d - c) + 2\|\Delta\|.$$

Put $M = \sup_{x\in[a,b]} |f(x)| + \sup_{x\in[a,b]} |g(x)| + 1$ and fix arbitrarily $\varepsilon > 0$. By
our hypotheses, there is a cover $([c_j, d_j])_{j=1}^{m}$ of $X$ of total length less than $\frac{\varepsilon}{8M}$. Let
$\Delta$ be a division of $[a,b]$ such that $\|\Delta\| < \frac{\varepsilon}{16mM}$. When we compute

$$|UDS_\Delta(f) - UDS_\Delta(g)|,$$

for all subintervals that are not intersecting $X$, $f$ and $g$ coincide and so, the corresponding
terms in the two sums cancel. Therefore

$$|UDS_\Delta(f) - UDS_\Delta(g)| = \left|\sum_k \left(\sup_{x\in[x_k, x_{k+1}]} f(x) - \sup_{x\in[x_k, x_{k+1}]} g(x)\right)(x_{k+1} - x_k)\right|$$

$$\le \sum_k \left(\sup_{x\in[x_k, x_{k+1}]} |f(x)| + \sup_{x\in[x_k, x_{k+1}]} |g(x)|\right)(x_{k+1} - x_k)$$

$$\le \left(\sup_{x\in[a,b]} |f(x)| + \sup_{x\in[a,b]} |g(x)|\right)\sum_k (x_{k+1} - x_k),$$

where the sum is computed only over all subintervals intersecting $X$. This is definitely
less than the same sum computed over all subintervals $[x_k, x_{k+1}]$ intersecting the
cover of $X$ and in this case,

$$\sum_k (x_{k+1} - x_k) \le \sum_{j=1}^{m} \left((d_j - c_j) + 2\|\Delta\|\right) < \frac{\varepsilon}{8M} + 2m\cdot\frac{\varepsilon}{16mM} = \frac{\varepsilon}{4M}.$$

Thus,

$$-\frac{\varepsilon}{4} < UDS_\Delta(f) - UDS_\Delta(g) < \frac{\varepsilon}{4}.$$

A similar computation gives us

$$-\frac{\varepsilon}{4} < LDS_\Delta(f) - LDS_\Delta(g) < \frac{\varepsilon}{4}.$$

Choose $\delta > 0$ such that $\|\Delta\| < \delta$ implies that $UDS_\Delta(f) - LDS_\Delta(f) < \frac{\varepsilon}{2}$. Then,
for $\|\Delta\| < \min\{\delta, \frac{\varepsilon}{16mM}\}$,

$$UDS_\Delta(g) - LDS_\Delta(g) < \left(UDS_\Delta(f) + \frac{\varepsilon}{4}\right) - \left(LDS_\Delta(f) - \frac{\varepsilon}{4}\right) = UDS_\Delta(f) - LDS_\Delta(f) + \frac{\varepsilon}{2} < \varepsilon$$

and the Riemann integrability of $g$ follows from the Darboux Criterion 9.5.6. Let
$I = \int_a^b f(x)\,dx$. Since

$$|UDS_\Delta(f) - I| \le UDS_\Delta(f) - LDS_\Delta(f) < \frac{\varepsilon}{2},$$

we infer that

$$|UDS_\Delta(g) - I| \le |UDS_\Delta(g) - UDS_\Delta(f)| + |UDS_\Delta(f) - I| < \frac{\varepsilon}{4} + \frac{\varepsilon}{2} < \varepsilon.$$

Taking the infimum over $\varepsilon > 0$, we conclude that $\int_a^b g(x)\,dx = I$.


Exercises 


1. Prove Theorem 9.5.2, that is, the fact that every regular function is Riemann
integrable and its integral as a regular function coincides with that in the sense
of Definition 9.5.1.

2. (Riemann integral via sequences). Prove that a function $f : [a,b] \to \mathbb{R}$ is
Riemann integrable if and only if there is a number $I$ such that for any sequence
$(\Delta_n)_n$ of divisions of $[a,b]$ with $\lim_{n\to\infty} \|\Delta_n\| = 0$ and every sequence $(\xi_n)_n$ of
intermediate $\Delta_n$-point systems, we have

$$\lim_{n\to\infty} S_{(\Delta_n,\xi_n)}(f) = I.$$


3. Compute


fine oe eee ee 
im — +++ ——_ ]. 
noo sinn \ cos? cos? + cos? 34—2 

n 3n 3n 


4. A simple example of a function which is not Riemann integrable is the everywhere


discontinuous function 


0 
j=, ifx € [0,1] Q@. 
Prove that the same is true for the function 


_ fsinx  ifx €[0,7/2]\Q 
g)= (= ifx €[0,7/2)NQ. 


5. Prove the properties of inheritance and additivity of Riemann integral, that is,
Theorems 9.5.7 and 9.5.8.

6. Prove that Cantor’s set, although uncountable, is a Jordan null set.
7. (a) Infer from the Darboux Criterion of integrability that the absolute value of
any Riemann integrable function is integrable too.

(b) Conclude from (a) that $\mathcal{R}([a,b],\mathbb{R})$ is a linear lattice.

(c) Extend the result in (a) to the case of complex-valued functions and prove
that

$$\left|\int_a^b f(x)\,dx\right| \le \int_a^b |f(x)|\,dx \le (b-a)\cdot\sup_{x\in[a,b]} |f(x)|$$

for every $f \in \mathcal{R}([a,b])$.
(d) Suppose that f € C!({a, b]). Show that 


b f Gu 
[ire ce J fea +25. sup if 


xe[a,b] 


[Hint: For (d), use the identity

$$f(x) = \int_0^1 f(t)\,dt + \int_0^x t f'(t)\,dt + \int_x^1 (t-1) f'(t)\,dt.]$$


8. (The generalized derivative of Lanczos [5]). Let f : [a, b] → R be a Riemann
integrable function and let x ∈ (a, b). For h > 0 small enough, we define

\[ D_h f(x) = \frac{3}{2h^3} \int_{-h}^{h} t f(x+t)\,dt. \]

(a) If $f \in C^3([a, b])$, then by Taylor’s formula,

\[ f(x+t) = f(x) + f'(x)\,t + f''(x)\,t^2/2 + f'''(\xi)\,t^3/6. \]

Conclude from here that

\[ D_h f(x) = f'(x) + O(h^2). \]

(b) Prove that if f : [a, b] → R is a Riemann integrable function and x ∈ (a, b)
is a point where f has finite one-sided derivatives, then

\[ \lim_{h \to 0} D_h f(x) = \frac{1}{2} \left[ f'_+(x) + f'_-(x) \right]. \]
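
The behaviour described in Exercise 8 can be explored numerically. The following is a minimal Python sketch, not part of the original text: the helper name `lanczos_derivative`, the midpoint quadrature and the step h = 0.01 are illustrative assumptions. It approximates $D_h f(x) = \frac{3}{2h^3} \int_{-h}^{h} t f(x+t)\,dt$ for one smooth function and for two functions with a corner at 0.

```python
import math


def lanczos_derivative(f, x, h, n=2000):
    """Approximate D_h f(x) = 3/(2 h^3) * integral_{-h}^{h} t f(x+t) dt
    with the composite midpoint rule on n subintervals (n is an assumption)."""
    dt = 2.0 * h / n
    total = 0.0
    for i in range(n):
        t = -h + (i + 0.5) * dt          # midpoint of the i-th subinterval
        total += t * f(x + t)
    return 3.0 / (2.0 * h ** 3) * total * dt


# Smooth case: the value should be close to the ordinary derivative cos(0.3).
smooth = lanczos_derivative(math.sin, 0.3, 1e-2)

# Corner cases at x = 0: one-sided derivatives are (-1, +1) and (0, 1).
corner = lanczos_derivative(abs, 0.0, 1e-2)
ramp = lanczos_derivative(lambda u: max(u, 0.0), 0.0, 1e-2)

print(smooth, math.cos(0.3))
print(corner, ramp)
```

Under these assumptions the smooth case agrees with cos(0.3) up to an error of order h², while the two corner cases return values close to 0 and 1/2, the arithmetic means of the one-sided derivatives, in agreement with part (b).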


9. (Riemann’s Criterion of integrability). Prove that a function f : [a, b] → R is
Riemann integrable if and only if it is bounded and for every ε > 0, there exists
a division $\Delta = (x_k)_{k=0}^{n}$ of [a, b] such that

\[ \sum_{k=0}^{n-1} \omega_f([x_k, x_{k+1}])\,(x_{k+1} - x_k) < \varepsilon. \]


10. (Riemann’s Pathological Function). Put (x) = 0 if {x} = 0.5 and for the rest,
(x) = x − n, where n is the integer nearest to x. Riemann considered the function

\[ f(x) = \frac{(x)}{1^2} + \frac{(2x)}{2^2} + \frac{(3x)}{3^2} + \cdots \quad \text{for } x \in [0, 1], \]

and proved that f is integrable, yet discontinuous at all points x = m/(2n), where m
and n are relatively prime integers. Give an argument.


9.6 Lebesgue’s Criterion of Integrability 


A closer look at the Riemann integral reveals that it is a rather restrictive concept. 
First, it applies only to bounded functions. Second, as Lebesgue observed, it applies 
only to functions whose set of discontinuities is negligible in some sense. This is 
made precise by the following definition: 


9.6.1 Definition A subset X of R is called a Lebesgue null set (or a set of Lebesgue
measure 0) if for every ε > 0, there is a countable family of bounded intervals $I_n$
such that $X \subset \bigcup_n I_n$ and $\sum_n \ell(I_n) < \varepsilon$.


As with the Jordan null sets, there is no difference if the intervals considered are 
open or closed. 

Every countable subset $X = \{x_0, x_1, x_2, \ldots\}$ of R is a Lebesgue null set. Indeed,
for every ε > 0, the intervals

\[ I_n = \left( x_n - \frac{\varepsilon}{2^{n+2}},\; x_n + \frac{\varepsilon}{2^{n+2}} \right) \]

constitute a cover of X of total length not exceeding ε.
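
The covering argument can be checked with exact rational arithmetic. The sketch below is an illustration only: the function name `cover`, the sample points and the choice of radii $\varepsilon / 2^{n+2}$ (one admissible choice that keeps the total length below ε) are assumptions, not taken from the original text.

```python
from fractions import Fraction


def cover(points, eps):
    """Return the intervals I_n = (x_n - eps/2^(n+2), x_n + eps/2^(n+2))."""
    intervals = []
    for n, x in enumerate(points):
        radius = eps / 2 ** (n + 2)      # interval I_n has length eps / 2^(n+1)
        intervals.append((x - radius, x + radius))
    return intervals


# Hypothetical sample: the first terms of an enumeration of the rationals in [0, 1].
points = [Fraction(0), Fraction(1), Fraction(1, 2), Fraction(1, 3),
          Fraction(2, 3), Fraction(1, 4), Fraction(3, 4), Fraction(1, 5)]
eps = Fraction(1, 100)

intervals = cover(points, eps)
total_length = sum(right - left for left, right in intervals)

# For N points the total length is eps * (1 - 2^(-N)), hence below eps.
print(total_length, total_length < eps)
```

The partial sums of the lengths form a geometric series bounded by ε, no matter how many points of the enumeration are covered.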

This line of reasoning can be easily adapted to show that a countable union of 
Lebesgue null sets is still a Lebesgue null set. It is obvious that a subset of a Lebesgue 
null set is itself a Lebesgue null set. 

A Jordan null set is a Lebesgue null set, but the converse is not true without some
extra hypothesis. N is an example of a Lebesgue null set which is not a Jordan null
set. However, every compact Lebesgue null set is a Jordan null set.

In Exercise 2, we draw attention to the fact that Lebesgue negligibility and
topological negligibility are distinct notions.


9.6.2 Definition A property P(x) depending on a real variable x is called true almost
everywhere (a.e. for short) if it is true for all possible values of the variable except
for a Lebesgue null set.

Thus, we can talk about a.e. continuous functions, about sequences of functions
converging a.e., and so on.


9.6.3 Theorem (Lebesgue’s Criterion of Riemann integrability) A function f :
[a, b] → R is Riemann integrable if and only if it is bounded and continuous
a.e.


The proof of Lebesgue’s Criterion needs some preparation and will be presented
in the next section.

This Criterion yields simple proofs of facts that are otherwise very difficult
to justify through the Darboux Criterion. For example, it shows
that the function

\[ f(x) = \begin{cases} 0 & \text{if } x = 0 \\ \sin(1/x) & \text{if } x \in (0, 1] \end{cases} \]

is Riemann integrable. Notice that this is not a regular function. The same criterion
can be used to prove that if f and g are Riemann integrable, then fg is Riemann
integrable.


Exercises 


1. (a) Prove that Cantor’s triadic set A is a Lebesgue null set.
(b) Infer that $\chi_A$ is Riemann integrable on [0, 1], yet its set of discontinuities is
uncountable. Then compute its integral.

2. (The lack of correspondence between Lebesgue null sets and first Baire category
sets).
(a) Let n be a strictly positive integer. To each rational number p/q, with p, q ∈ N
and 0 < p < q, we attach the interval


( p 1 . p fa 1 ) 

q nq24 q_ nq24 
and denote by $S_n$ the union of all such intervals. Prove that $S_n$ is an open dense
subset of [0, 1], covered by countably many open intervals, the sum of whose
lengths does not exceed 2/n.
(b) Let S be the intersection of all the sets $S_n$ (S is precisely the set of Liouville
numbers). According to (a), S is a set of the second Baire category. Prove that S
is an uncountable Lebesgue null set.
(c) Prove that the complement of S in [0, 1] is a set of the first Baire category but not a
Lebesgue null set.

3. Suppose that f : [a, b] → [c, d] is a Riemann integrable function and g :
[c, d] → R is a continuous function. Prove that g ∘ f is Riemann integrable.

4. (a) (J.K. Thomae). Prove that the ruler function R : [0, 1] → R, given by

\[ R(x) = \begin{cases} 1 & \text{if } x = 0 \\ 1/q & \text{if } x = p/q,\ p, q \in \mathbb{N}^*,\ p \text{ prime to } q \\ 0 & \text{if } x \text{ is irrational,} \end{cases} \]

is Riemann integrable and its integral is 0 (yet R is discontinuous at infinitely
many points in every arbitrarily small subinterval of [0, 1]).
(b) Using R, construct an example to show that composition of Riemann
integrable functions may not be a Riemann integrable function.
[Hint: Consider a Riemann integrable function f : [0, 1] → [0, 1] such that
(f ∘ R)(x) = 0 when x is rational and (f ∘ R)(x) = 1 when x is irrational.]

5. (J.R. Munkres). If g : R → R has the property that g ∘ f is Riemann integrable
on [a, b] whenever f is continuous on [a, b], then g is continuous.

6. Prove that $\lim_{x \to 0} \frac{1}{x} \int_0^x \sin\left(\frac{1}{t}\right) dt = 0$, no matter how the integrand is defined at
t = 0.
[Hint: Use Bonnet’s Mean Value Theorem.]


9.7 The Proof of Lebesgue’s Criterion 


Suppose that f is a real function defined on a subset A of R and a ∈ A. For r > 0,
we define the r-oscillation of f at a by

\[ \omega_f(a; r) = \sup_{x, y \in A \cap (a-r,\, a+r)} |f(x) - f(y)| \]

and the oscillation of f at a by

\[ \omega_f(a) = \inf_{r > 0} \omega_f(a; r) = \lim_{r \to 0} \omega_f(a; r). \]

It is easy to see that f is continuous at a if and only if $\omega_f(a) = 0$.
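
The two notions of oscillation can be estimated by sampling. The following Python sketch is only an illustration under stated assumptions: the names `r_oscillation`, `oscillation` and `step`, the domain A = [0, 1], the sample count and the list of radii are all hypothetical, and sampling gives a lower estimate of the supremum rather than its exact value.

```python
def r_oscillation(f, a, r, domain=(0.0, 1.0), samples=10000):
    """Estimate the r-oscillation of f at a on A = [domain[0], domain[1]].

    The supremum of |f(x) - f(y)| over the window A ∩ (a - r, a + r) equals
    sup f - inf f there; it is estimated from interior sample points only.
    """
    lo = max(domain[0], a - r)
    hi = min(domain[1], a + r)
    values = [f(lo + (hi - lo) * (i + 0.5) / samples) for i in range(samples)]
    return max(values) - min(values)


def oscillation(f, a, radii=(1e-1, 1e-2, 1e-3, 1e-4)):
    """Estimate the oscillation at a as the smallest sampled r-oscillation."""
    return min(r_oscillation(f, a, r) for r in radii)


def step(x):
    # Hypothetical test function with a single jump at x = 0.5.
    return 0.0 if x < 0.5 else 1.0


jump = oscillation(step, 0.5)                # the jump is seen for every radius
flat = oscillation(step, 0.3)                # zero once the window misses the jump
smooth = oscillation(lambda x: x * x, 0.5)   # shrinks together with the radius

print(jump, flat, smooth)
```

The estimate stays equal to the size of the jump at the discontinuity and tends to 0 at points of continuity, which is the content of the last remark.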


9.7.1 Lemma If A is a closed set and f : A → R is a bounded function, then for
every ε > 0, the set

\[ E_\varepsilon = \{ x : x \in A,\ \omega_f(x) \ge \varepsilon \} \]

is closed.


Proof Let $(x_n)_n$ be a sequence in $E_\varepsilon$ such that $x_n \to x$. Suppose that $x \notin E_\varepsilon$. Then
$\omega_f(x) < \varepsilon$ and thus, there is δ > 0 such that $\omega_f(x; \delta) < \varepsilon$. Since $x_n \to x$ and
(x − δ, x + δ) is an open set, there is an index n and a positive number r such that

\[ (x_n - r,\, x_n + r) \subset (x - \delta,\, x + \delta). \]

Thus, $\omega_f(x_n) \le \omega_f(x_n; r) \le \omega_f(x; \delta) < \varepsilon$, which contradicts the fact that
$x_n \in E_\varepsilon$.


9.7.2 Generalized Heine’s Theorem Let A be a compact subset of R and let
f : A → R be a bounded function. If $\omega_0 > 0$ and $\omega_f(x) < \omega_0$ for all x ∈ A, then
for each ε > 0, there is δ > 0 such that

\[ \omega_f(x; \delta) < \omega_0 + \varepsilon \quad \text{whenever } x \in A. \]

Proof Suppose not. Then, there is ε > 0 and a sequence $(x_n)_n$ of elements of A such
that $\omega_f(x_n; 1/n) \ge \omega_0 + \varepsilon$ for every n. Since A is a compact set, the sequence $(x_n)_n$
has a convergent subsequence $(x_{k_n})_n$. Let $x = \lim_{n \to \infty} x_{k_n}$. If r > 0, there is N such
that $(x_{k_n} - 1/k_n,\, x_{k_n} + 1/k_n) \subset (x - r,\, x + r)$ for all n ≥ N. Therefore,

\[ \omega_0 + \varepsilon \le \omega_f\left(x_{k_n}; \tfrac{1}{k_n}\right) \le \omega_f(x; r) \]

for every r > 0, which contradicts the hypotheses.


9.7.3 Paul du Bois-Reymond Criterion A function f : [a, b] → R is Riemann
integrable if and only if it is bounded and for each ε > 0, the set

\[ E_\varepsilon = \{ x : x \in [a, b],\ \omega_f(x) \ge \varepsilon \} \]

is a Jordan null set.


Proof We will show first that the condition is necessary.

If f is Riemann integrable, then by Theorem 9.5.3, f is a bounded function.
Suppose that there is ε > 0 such that $E_\varepsilon$ is not a Jordan null set. In this case, there
is some α > 0 such that any finite cover of $E_\varepsilon$ with open intervals has total length
greater than α (the existence of a finite cover follows from the fact that $E_\varepsilon$ is a
compact set). Let Δ be a division of [a, b]. If S is a subinterval of Δ which intersects
$E_\varepsilon$, then

\[ \left( \sup_S f - \inf_S f \right) \ell(S) \ge \varepsilon\, \ell(S) \]

and since the sum of the lengths of intervals intersecting $E_\varepsilon$ is at least α, we infer
that

\[ UDS_\Delta(f) - LDS_\Delta(f) \ge \alpha\varepsilon, \]

which contradicts Darboux Criterion.

We will show now that the condition is sufficient. Let ε > 0. Choose ε′ > 0 and
ε″ > 0 such that

\[ \varepsilon' \left( \sup_{[a,b]} f - \inf_{[a,b]} f \right) < \frac{\varepsilon}{2} \quad \text{and} \quad \varepsilon''(b - a) < \frac{\varepsilon}{2}. \]

Since $E_{\varepsilon''}$ is a Jordan null set, there is $\{A_1, A_2, \ldots, A_n\}$, a family of open intervals
covering $E_{\varepsilon''}$ and of total length less than ε′. The set $K = [a, b] \setminus \bigcup_{k=1}^{n} A_k$ is compact.
By Theorem 9.7.2, there is δ > 0 such that for every interval I included in K, with
ℓ(I) < δ, we have

\[ \sup_I f - \inf_I f < \varepsilon''. \]

As in the proof of Theorem 9.5.10, there is δ′ > 0 such that for every division Δ with
‖Δ‖ < δ′, the subintervals of Δ intersecting at least one of the sets $A_1, A_2, \ldots, A_n$
have total length less than ε′. If ‖Δ‖ < min{δ, δ′}, $S_1, S_2, \ldots, S_m$ are the
subintervals of Δ intersecting at least one of $A_1, A_2, \ldots, A_n$, and P is the set of all other
subintervals of Δ, then

\[ \begin{aligned} UDS_\Delta(f) - LDS_\Delta(f) &= \sum_{S \in P} \left( \sup_S f - \inf_S f \right) \ell(S) + \sum_{k=1}^{m} \left( \sup_{S_k} f - \inf_{S_k} f \right) \ell(S_k) \\ &\le \varepsilon'' \sum_{S \in P} \ell(S) + \left( \sup_{[a,b]} f - \inf_{[a,b]} f \right) \sum_{k=1}^{m} \ell(S_k) \\ &< \varepsilon''(b - a) + \varepsilon' \left( \sup_{[a,b]} f - \inf_{[a,b]} f \right) < \varepsilon. \end{aligned} \]

According to Darboux Criterion of integrability, f is a Riemann integrable
function.


Proof of Lebesgue’s Criterion of Riemann integrability. We will show first that the
condition is necessary. If f is Riemann integrable, then by Theorem 9.5.3, f is a
bounded function. The set of all discontinuity points of f is

\[ \{ x \in [a, b] : \omega_f(x) > 0 \} = \bigcup_{n=1}^{\infty} \left\{ x \in [a, b] : \omega_f(x) \ge \frac{1}{n} \right\}. \]

By Paul du Bois-Reymond Criterion, this is a countable union of Jordan null sets and
so, is a Lebesgue null set. Next, we will prove that the condition is also sufficient.
Let ε > 0. By Lemma 9.7.1, $E_\varepsilon$ is a compact set. Since $E_\varepsilon$ is a subset of the set
of all discontinuity points of f, $E_\varepsilon$ is a Lebesgue null set. Being compact, it is also
a Jordan null set. By the Paul du Bois-Reymond Criterion, we conclude that f is a
Riemann integrable function.


9.8 Weyl’s Ergodic Theorem


In what follows, we will discuss a remarkable property of the function

\[ T : [0, 2\pi] \to [0, 2\pi], \quad T(x) = x + \alpha \mod 2\pi, \]

where α is a fixed real number such that α/π ∉ Q. More precisely, we will show
that the time average of any Riemann integrable function f along the iterates of this
function,

\[ \lim_{N \to \infty} \frac{1}{N} \sum_{k=1}^{N} f(T^k(x)), \]

equals the space average of the function f,

\[ \frac{1}{2\pi} \int_0^{2\pi} f(t)\,dt. \]

The starting point is the following result originating in analytic number theory.
See Exercise 3.


9.8.1 Theorem (Weyl’s Ergodic Theorem) Let α be a real number such that
α/π is not in Q. Then

\[ \lim_{N \to \infty} \frac{1}{N} \sum_{k=1}^{N} f(x + k\alpha) = \frac{1}{2\pi} \int_0^{2\pi} f(t)\,dt, \]

for every continuous and periodic function f : R → C, of period 2π, and every
x ∈ R.

Proof Let x ∈ R. Using the summation of geometric progression, we infer that

\[ \lim_{N \to \infty} \frac{1}{N} \sum_{k=1}^{N} e^{in(x + k\alpha)} = \frac{1}{2\pi} \int_0^{2\pi} e^{int}\,dt \]

for every n ∈ Z. By linearity, this formula extends to every trigonometric
polynomial P:

\[ \lim_{N \to \infty} \frac{1}{N} \sum_{k=1}^{N} P(x + k\alpha) = \frac{1}{2\pi} \int_0^{2\pi} P(t)\,dt. \]

Using a density argument, we will show that it actually works for all continuous
periodic functions f : R → C, of period 2π.

Let ε > 0. By Theorem 8.8.3, there is a trigonometric polynomial $P_\varepsilon(x) = \sum_n c_n e^{inx}$
such that $\sup \{ |f(x) - P_\varepsilon(x)| : x \in \mathbb{R} \} < \varepsilon$. Therefore, the absolute
value of $\frac{1}{2\pi} \int_0^{2\pi} f(t)\,dt - \frac{1}{N} \sum_{k=1}^{N} f(x + k\alpha)$ is bounded above by

\[ \frac{1}{2\pi} \int_0^{2\pi} |f(t) - P_\varepsilon(t)|\,dt + \frac{1}{N} \sum_{k=1}^{N} |f(x + k\alpha) - P_\varepsilon(x + k\alpha)| + \left| \frac{1}{2\pi} \int_0^{2\pi} P_\varepsilon(t)\,dt - \frac{1}{N} \sum_{k=1}^{N} P_\varepsilon(x + k\alpha) \right|, \]

from where the result follows.
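
The equality of time and space averages in Theorem 9.8.1 can be observed numerically. The sketch below is a hedged illustration, not part of the original argument: the function names, the test function f(t) = cos² t + sin 3t, the starting point x = 0.7 and the choice α = 1 (for which α/π is irrational) are all assumptions made for the example.

```python
import math


def time_average(f, x, alpha, N):
    """(1/N) * sum_{k=1}^{N} f(x + k*alpha)."""
    return sum(f(x + k * alpha) for k in range(1, N + 1)) / N


def space_average(f, n=20000):
    """(1/(2*pi)) * integral_0^{2*pi} f(t) dt, by the midpoint rule."""
    h = 2.0 * math.pi / n
    return sum(f((i + 0.5) * h) for i in range(n)) * h / (2.0 * math.pi)


def f(t):
    # Hypothetical trigonometric polynomial; its space average is exactly 1/2.
    return math.cos(t) ** 2 + math.sin(3.0 * t)


avg_space = space_average(f)
avg_time = time_average(f, 0.7, 1.0, 100000)        # alpha = 1, alpha/pi irrational

# Contrast: alpha = 2*pi makes T the identity, so the orbit never leaves x.
stuck = time_average(f, 0.7, 2.0 * math.pi, 1000)

print(avg_space, avg_time, stuck, f(0.7))
```

With α = 1 the time average approaches 1/2, whereas for α = 2π it stays at f(0.7), which shows that the irrationality hypothesis cannot simply be dropped.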


Theorem 9.8.1 can be easily extended to the case of continuous and periodic functions
f : R → C, of period T, provided that α/T ∉ Q.

Every continuous and periodic function f : R → C, of period 1, is the extension
by periodicity of a continuous function g : [0, 1] → C such that g(0) = g(1).
Therefore, Weyl’s Ergodic Theorem can be reformulated as follows: Suppose that
x ∈ R and α ∈ R \ Q. Then

\[ \lim_{N \to \infty} \frac{1}{N} \sum_{k=1}^{N} f(\{x + k\alpha\}) = \int_0^1 f(t)\,dt \tag{9.6} \]

for every continuous function f : [0, 1] → C such that f(0) = f(1). Here {·}
represents the fractional part.

It is worth noticing that the formula (9.6) can be extended to all Riemann integrable
functions f : [0, 1] → C.

The key remark is that for any subinterval A of [0, 1] and for any ε > 0, there are
two continuous functions f, g : [0, 1] → R with the following properties:

(a) f(0) = f(1) and g(0) = g(1).
(b) $f \le \chi_A \le g$.
(c) $\int_0^1 (g - f)\,dx < \varepsilon$.

Then

\[ \frac{1}{N} \sum_{k=1}^{N} f(\{x + k\alpha\}) \le \frac{1}{N} \sum_{k=1}^{N} \chi_A(\{x + k\alpha\}) \le \frac{1}{N} \sum_{k=1}^{N} g(\{x + k\alpha\}), \]

and taking into account the inequalities

\[ \int_0^1 f\,dx \le \int_0^1 \chi_A\,dx \le \int_0^1 g\,dx \le \int_0^1 f\,dx + \varepsilon, \]

we infer that (9.6) holds for characteristic functions of subintervals of [0, 1]. This
conclusion extends by linearity to all step functions and thus, to all Riemann
integrable functions (due to Darboux Criterion of integrability).

As an application of this result, we will determine the frequency of 7 showing up
as the first digit of a power of 2:

1, 2, 4, 8, 1, 3, 6, 1, ....

$2^n$ starts with a 7 if there is some natural number k such that

\[ 7 \cdot 10^k \le 2^n < 8 \cdot 10^k. \]

Therefore $\log_{10} 7 + k \le n \log_{10} 2 < \log_{10} 8 + k$ and thus $\{n \log_{10} 2\}$ belongs to
$[\log_{10} 7, \log_{10} 8)$. Since $\log_{10} 2 \notin \mathbb{Q}$, the frequency of 7 is

\[ \lim_{N \to \infty} \frac{1}{N} \sum_{k=1}^{N} \chi_{[\log_{10} 7,\, \log_{10} 8)}(\{k \log_{10} 2\}) = \int_0^1 \chi_{[\log_{10} 7,\, \log_{10} 8)}(t)\,dt = \log_{10} \frac{8}{7} = 0.057\,991\ldots \]

Analogously, the frequency of 8 as the first digit of a power of 2 is $\log_{10} \frac{9}{8} = 0.051\,152\ldots$,
from where we conclude that there are more powers of 2 starting with
digit 7 than with digit 8. This may appear very surprising since the first encounter
with 7 is for $2^{46}$ = 70 368 744 177 664.
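
These claims are easy to test with exact integer arithmetic. The Python sketch below is an illustration only; the function names and the cut-off N = 3000 are assumptions chosen for the example, and the observed frequencies are finite-sample values that only approximate the limits.

```python
import math


def leading_digit(m):
    """First decimal digit of the positive integer m."""
    return int(str(m)[0])


def first_power_with_digit(d):
    """Smallest n >= 0 such that 2^n starts with the digit d."""
    n, power = 0, 1
    while leading_digit(power) != d:
        n, power = n + 1, 2 * power
    return n


def frequency(d, N):
    """Fraction of the powers 2^1, ..., 2^N whose leading digit is d."""
    count, power = 0, 1
    for _ in range(N):
        power *= 2
        if leading_digit(power) == d:
            count += 1
    return count / N


n7 = first_power_with_digit(7)
freq7 = frequency(7, 3000)
freq8 = frequency(8, 3000)

print(n7, 2 ** n7)
print(freq7, math.log10(8 / 7))
print(freq8, math.log10(9 / 8))
```

The first exponent found is 46, and the sampled frequencies stay close to the limiting values $\log_{10}(8/7)$ and $\log_{10}(9/8)$.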


Exercises 


1. Let A be a closed subset of [0, 1] which is not the whole interval. Find a continuous
function f : [0, 1] → [0, 1] such that f(0) = f(1) = 0, $f|_A = 0$ and f is not
identically 0.
[Hint: See Exercise 7, Sect. 6.1.]

2. Let α ∈ R \ Q. Prove, using the previous exercise and Weyl’s Ergodic Theorem,
that the sequence $(\{n\alpha\})_n$ is dense in [0, 1].

3. A sequence $(x_n)_n$ of real numbers is said to be uniformly distributed mod 1 if for
every a, b with 0 ≤ a < b ≤ 1, we have

\[ \lim_{n \to \infty} \frac{\#(\{k : 0 \le k \le n-1 \text{ and } \{x_k\} \in [a, b]\})}{n} = b - a. \]

Here $\{x_k\}$ means the fractional part of $x_k$. Prove that for every α ∈ R \ Q, the
sequence $(\{n\alpha\})_n$ is uniformly distributed mod 1.

4. Give an example of a sequence which is dense in [0, 1] but not uniformly distributed
mod 1.

5. Let f : R → C be a continuous periodic function, of period T > 0, and let
g : [0, 1] → C be a Riemann integrable function. Prove that:
(a) If x ∈ R with x/T ∉ Q, and $(a_n)_n$ is a real sequence, then


T 1 
ie k 1 
He, Dat era (=) = 7 [foe [oma 
= 0 0 


(b) If $x = \frac{p}{q} T$ with p ∈ Z, q ∈ N* and gcd(p, q) = 1, then

\[ \lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} f(kx)\, g\left(\frac{k}{n}\right) = \frac{1}{q} \sum_{k=0}^{q-1} f\left(\frac{kT}{q}\right) \cdot \int_0^1 g(t)\,dt. \]

[Hint: Consider first the case where all $a_n$ are zero and g = 1. This case can be
settled via Theorem 9.8.1.]
6. From Exercise 5, we infer that 


_ 2 l< 1 
ae Ne and ee 


Combining this fact with Theorem 2.8.1, conclude that sinn — 0 in density. 
7. Prove the continuous form of Weyl’s Ergodic Theorem: If f : R → C is a
continuous periodic function of period 2π, and α is a real number such that
α/π ∉ Q, then

\[ \lim_{T \to \infty} \frac{1}{T} \int_0^T f(x + t\alpha)\,dt = \frac{1}{2\pi} \int_0^{2\pi} f(t)\,dt \]

for every x ∈ R.


9.9 Notes and Remarks


The need (imposed by mathematical physics) to consider integrals of highly discon- 
tinuous functions led Bernhard Riemann [6] in 1854 to his notion of integral. Exercise 
4 in Sect. 9.6 shows an example of a function which is discontinuous infinitely many 
times in every arbitrarily small interval, yet integrable on every compact interval. 

The upper and lower Darboux integral sums appeared also in Riemann’s approach. 
In 1875, Jean Gaston Darboux reformulated Riemann’s criterion of integrability (see 
Exercise 9 at the end of Sect.9.5) in the form as presented in Theorem 9.5.6. 

Many applications of the Riemann integral come through certain integral in- 
equalities such as Young’s inequality, Hermite-Hadamard inequality and so on. The 
precision in the Hermite-Hadamard inequality can be estimated via two classical 
inequalities that work in the Lipschitz function framework. 

Suppose that f : [a, b] → R is a Lipschitz function with the Lipschitz constant
$\|f\|_{Lip} = M > 0$. Then, the left Hermite-Hadamard inequality can be estimated by
the inequality of Ostrowski,

\[ \left| f(x) - \frac{1}{b-a} \int_a^b f(t)\,dt \right| \le \left[ \frac{1}{4} + \left( \frac{x - \frac{a+b}{2}}{b-a} \right)^2 \right] M(b - a), \]

while the right Hermite-Hadamard inequality can be estimated by the inequality of
Iyengar,

\[ \left| \frac{f(a) + f(b)}{2} - \frac{1}{b-a} \int_a^b f(t)\,dt \right| \le \frac{M(b-a)}{4} - \frac{1}{4M(b-a)} \left( f(b) - f(a) \right)^2. \]

See Niculescu and Persson [7] for details. 

Extensions of the inequalities of Hermite-Hadamard and Young can be found in 
the papers of Niculescu [8] and Corina Flavia Mitroi and Niculescu [9]. 

The estimation provided by Exercise 2, Sect. 9.1, for the trapezoid method can
be extended to an arbitrary finite family of points. Of considerable interest in
numerical analysis is the Koksma-Hlawka inequality, which works for all functions
f : [a, b] → R which can be represented as differences of increasing functions:

\[ \left| \frac{1}{b-a} \int_a^b f(t)\,dt - \frac{1}{N} \sum_{k=1}^{N} f(x_k) \right| \le V_a^b(f) \cdot D^*(x_1, \ldots, x_N), \]

where $V_a^b(f)$ represents the variation of f (see Appendix B, Sect. B3) and


\[ D^*(x_1, \ldots, x_N) = \sup_{0 < c \le b-a} \left| \frac{\#\{k : x_k \in [a, a+c)\}}{N} - \frac{c}{b-a} \right| \]

is the so-called star-discrepancy of the set $\{x_1, \ldots, x_N\}$. Applications can be found
in the book of Harald Niederreiter [10].

Weyl’s Ergodic Theorem is due to Hermann Weyl [11], who was interested in the
problem of uniform distribution mod 1. A comprehensive survey on this problem can
be found in the book of Kuipers and Niederreiter [12]. Among the
many results proved in that book, we mention here the following two:

(H. Weyl) If P is a polynomial with at least one irrational coefficient (other than
the constant term), then the sequence P(n) is uniformly distributed modulo 1.

(I. M. Vinogradov) The sequence of all multiples of an irrational number α by
successive prime numbers,


2α, 3α, 5α, 7α, 11α, ...


is equidistributed modulo 1.

The problem concerning the frequency of 7 showing up as the first digit of a
power of 2 is an illustration of Benford’s Law (also called the First-Digit Law), that
refers to the frequency distribution of digits in many (but not all) real-life sources of
data. More precisely, a set of numbers is said to be Benford distributed if the leading
digit d (d ∈ {1, 2, ..., 9}) occurs with probability $P(d) = \log_{10} \frac{d+1}{d}$. Thus, in this
distribution, the number 1 occurs as the leading digit about 30 % of the time, while
larger numbers occur in that position less frequently: 9 as the first digit appears less
than 5 % of the time. Among the numerical sequences known to obey Benford’s Law,
we mention here the Fibonacci numbers, the factorials, and the powers of any natural
number N provided that $\log_{10} N \notin \mathbb{Q}$. A nice introduction to the mathematics behind
Benford’s Law is offered by the paper of Persi Diaconis [13].
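
The percentages quoted above follow directly from the formula $P(d) = \log_{10}((d+1)/d)$. The short Python sketch below tabulates the nine probabilities; the function name `benford` is an illustrative assumption.

```python
import math


def benford(d):
    """P(d) = log10((d + 1) / d) for a leading digit d in 1..9."""
    return math.log10((d + 1) / d)


probabilities = {d: benford(d) for d in range(1, 10)}

for d, p in probabilities.items():
    print(d, round(100 * p, 2), "%")

# The nine probabilities telescope to log10(10) = 1.
print(sum(probabilities.values()))
```

The table confirms that the digit 1 leads about 30.1 % of the time and the digit 9 about 4.6 % of the time, and that the probabilities decrease with d.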

The Fundamental Theorem of Calculus can be easily extended to the framework
of functions having a parametric derivative. A function F : [a, b] → C has a
parametric derivative f if there is a strictly increasing differentiable function φ
from some interval [α, β] onto [a, b], such that F ∘ φ has an ordinary derivative on
[α, β] with

\[ (F \circ \varphi)'(t) = f(\varphi(t))\,\varphi'(t). \]

When φ′(t) ≠ 0, then F′(x) = f(x) is the ordinary derivative at x = φ(t). Thus,
for φ(t) = t, the parametric derivative coincides with the ordinary derivative. On the
other hand, there are nondifferentiable functions which have a parametric derivative.
For example, the function F(x) = |x|, for x ∈ [−1, 1], is not differentiable at
x = 0. However, a parametric derivative of F is

\[ f(x) = \begin{cases} -1 & \text{if } x \in [-1, 0) \\ \text{arbitrary} & \text{if } x = 0 \\ 1 & \text{if } x \in (0, 1], \end{cases} \]

corresponding to $\varphi(t) = t^3$, t ∈ [−1, 1]. A more elaborate example is the Cantor-
Lebesgue singular function. See Appendix B, Sect. B2, Exercise 4 (and also [14]).

The most general form of Fundamental Theorem of Calculus works within the 
framework of generalized Riemann integral. See Theorem 9.9.4. This integral was 
initiated around 1960 by Jaroslav Kurzweil and Ralph Henstock. 

In what follows, we give a glimpse of this theory in the case of functions defined 
on compact intervals. A complete exposition of the generalized Riemann integral 
(including the case of functions defined on noncompact intervals) can be found in 
the books of Bartle [15], Gordon [16] and Peng Yee Lee and Rudolph Vyborny [17]. 
In particular, Bartle’s book has a good wealth of examples and counterexamples. We 
recommend also the introductory paper of Bartle [18]. 

A tagged division of the interval [a, b] is an ordered finite set

\[ \Delta = (x_0, x_1, \ldots, x_n;\ \xi_0, \xi_1, \ldots, \xi_{n-1}) \]

where $a = x_0 < x_1 < \cdots < x_n = b$ and $\xi_k \in [x_k, x_{k+1}]$. In this case $\xi_k$ works as a
tag for the interval $[x_k, x_{k+1}]$.

A function δ : [a, b] → (0, ∞) will be called a gauge function. If δ is such a
function, we say that a tagged division Δ is δ-fine if

\[ [x_k, x_{k+1}] \subset (\xi_k - \delta(\xi_k),\ \xi_k + \delta(\xi_k)) \]

for every k.

The proof of existence of δ-fine divisions for an arbitrary interval [a, b] is based
on the obvious fact that if $\Delta_1$ is a δ-fine division of [a, c] and $\Delta_2$ is a δ-fine division
of [c, b], then $\Delta_1 \cup \Delta_2$ is a δ-fine division of [a, b].


9.9.1 Cousin’s Principle (see [19]) For every δ, a gauge function on [a, b], there
is a δ-fine tagged division of [a, b].


Proof Suppose not. Then, the same is true for at least one of the intervals

\[ \left[ a, \frac{a+b}{2} \right] \quad \text{and} \quad \left[ \frac{a+b}{2}, b \right]. \]

Let $I_0 = [a, b]$ and let $I_1 = [a_1, b_1]$ be one of the above subintervals which does not
have a δ-fine division. Repeating the procedure, we obtain a decreasing sequence of
intervals $I_n = [a_n, b_n]$, none of which has a δ-fine division. Clearly, $\ell(I_n) \to 0$. By
the Nested Interval Lemma, the intersection of all these intervals consists of exactly
one point, say ξ. Since δ(ξ) > 0, there is n such that

\[ I_n \subset (\xi - \delta(\xi),\ \xi + \delta(\xi)) \]

and so, $\Delta = (a_n, b_n; \xi)$ is a δ-fine division of $I_n$, which is a contradiction.
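
The bisection idea behind the proof can be turned into a search procedure. The Python sketch below is only an illustration under stated assumptions: the names `fine_division`, `is_fine` and `gauge` are hypothetical, only the endpoints and the midpoint of each interval are tried as tags, and a depth cap stops the recursion. Cousin's Principle guarantees that a fine division exists, but this restricted search is not guaranteed to find one for every gauge.

```python
def fine_division(gauge, a, b, depth=0, max_depth=60):
    """Return (left, right, tag) triples forming a gauge-fine tagged division
    of [a, b], found by repeated bisection."""
    for tag in (a, (a + b) / 2.0, b):            # candidate tags (an assumption)
        d = gauge(tag)
        if tag - d < a and b < tag + d:          # [a, b] inside (tag - d, tag + d)
            return [(a, b, tag)]
    if depth >= max_depth:
        raise RuntimeError("no fine division found with these candidate tags")
    c = (a + b) / 2.0
    return (fine_division(gauge, a, c, depth + 1, max_depth)
            + fine_division(gauge, c, b, depth + 1, max_depth))


def is_fine(division, gauge, a, b):
    """Check that the triples tile [a, b] and satisfy the fineness condition."""
    if division[0][0] != a or division[-1][1] != b:
        return False
    for i, (left, right, tag) in enumerate(division):
        if i > 0 and division[i - 1][1] != left:
            return False
        if not (left <= tag <= right):
            return False
        if not (tag - gauge(tag) < left and right < tag + gauge(tag)):
            return False
    return True


def gauge(x):
    # Hypothetical gauge on [0, 1]: it shrinks towards 0 but is positive there.
    return x / 2.0 if x > 0 else 0.1


division = fine_division(gauge, 0.0, 1.0)
for piece in division:
    print(piece)
print(is_fine(division, gauge, 0.0, 1.0))
```

For this gauge the point 0 is forced to be a tag: no other tag ξ has 0 inside (ξ − δ(ξ), ξ + δ(ξ)), which shows how a gauge can control where the tags of a fine division fall.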

For a function f : [a, b] → R and

\[ \Delta = (x_0, x_1, \ldots, x_n;\ \xi_0, \xi_1, \ldots, \xi_{n-1}), \]

a tagged division, we define the integral sum

\[ S_\Delta(f) = \sum_{k=0}^{n-1} f(\xi_k)\,(x_{k+1} - x_k). \]


9.9.2 Definition A function f : [a, b] → R is said to be (HK)-integrable (that is,
Henstock-Kurzweil integrable) if there is a number I such that for every ε > 0, there
is a gauge function $\delta_\varepsilon$ on [a, b] such that

\[ |S_\Delta(f) - I| < \varepsilon \]

for every $\delta_\varepsilon$-fine division Δ of [a, b].


It is easy to see that when this is the case, the number I is unique. The number I
will be called the (HK)-integral of f over [a, b] and denoted by

\[ (HK) \int_a^b f(x)\,dx. \]

The above definition is completed by

\[ (HK) \int_a^a f(x)\,dx = 0 \quad \text{and} \quad (HK) \int_b^a f(x)\,dx = -(HK) \int_a^b f(x)\,dx. \]

The Riemann integral corresponds to the case where only constant gauge functions
are allowed.


9.9.3 Example The function

\[ f(x) = \begin{cases} 0 & \text{if } x \in [0, 1] \setminus \mathbb{Q} \\ 1 & \text{if } x \in [0, 1] \cap \mathbb{Q}, \end{cases} \]

known as a nonintegrable function in the sense of Riemann, is (HK)-integrable and
its integral is 0.


In fact, let $r_0, r_1, r_2, \ldots$ be an enumeration of [0, 1] ∩ Q and let ε > 0. We define
the gauge function

\[ \delta(x) = \begin{cases} 1 & \text{if } x \in [0, 1] \setminus \mathbb{Q} \\ 2^{-k-2}\varepsilon & \text{if } x \in [0, 1] \cap \mathbb{Q},\ x = r_k. \end{cases} \]

For every δ-fine division Δ of [0, 1], there are at most two subintervals having
$r_k$ as an index and their total length does not exceed $2^{-k-1}\varepsilon$. Therefore, in $S_\Delta(f)$ the
contribution of the summands corresponding to the intervals having $r_k$ as index does
not exceed $2^{-k-1}\varepsilon$. Since the summands corresponding to irrational index points are
0, we get

\[ 0 \le S_\Delta(f) \le \sum_{k=0}^{\infty} 2^{-k-1}\varepsilon = \varepsilon. \]

Thus, f is (HK)-integrable and $(HK) \int_0^1 f(x)\,dx = 0$.

More generally, if f : [a, b] → R is a function such that f = 0 except for a
Lebesgue null set, then f is (HK)-integrable and $(HK) \int_a^b f(x)\,dx = 0$.

We end this chapter by noticing the variant of Fundamental Theorem of Calculus
in the case of Riemann generalized integral. For a more general result, see the
monograph of Bartle [15], Sect. 5, Theorem 5.12.


9.9.4 Fundamental Theorem of Calculus (The version for (HK)-integral) Let
F : [a, b] → C be a continuous function, differentiable on [a, b] \ X, where X is
a countable set. Let g : [a, b] → C be a function such that F′(x) = g(x) for all
x ∈ [a, b] \ X. Then g is (HK)-integrable on [a, b] and

\[ (HK) \int_a^b g(x)\,dx = F(b) - F(a). \]


References 


1. Bourbaki, N.: Fonctions d’une variable réelle. Springer, Berlin (2007)
2. Ritt, J.F.: Integration in Finite Terms: Liouville’s Theory of Elementary Methods. Columbia
University Press, New York (1948)
3. Adamchik, V., Wagon, S.: A simple formula for π. Am. Math. Mon. 104, 852-855 (1997)
4. Stark, E.: Application of a mean value theorem for integrals to series summation. Am. Math.
Mon. 85, 481-483 (1978)
5. Lanczos, C.: Applied Analysis. Prentice-Hall, Englewood Cliffs (1956)
6. Riemann, B.: Ueber die Darstellbarkeit einer Funktion durch eine trigonometrische Reihe.
Read in 1854, published in 1867. Republished in Riemann’s Gesammelte Math. Werke, 1892,
pp. 227-271. Reprint, Dover Publications Inc, New York (1953)
7. Niculescu, C.P., Persson, L.-E.: Convex Functions and Their Applications. A Contemporary
Approach. CMS Books in Mathematics. Springer, New York (2006)
8. Niculescu, C.P.: The Hermite-Hadamard inequality for log-convex functions. Nonlinear Anal.
75, 662-669 (2012)
9. Mitroi, F.C., Niculescu, C.P.: An Extension of Young’s Inequality. Abstract and Applied
Analysis, Article ID 162049, vol. 2011, p. 18
10. Niederreiter, H.: Random number generation and Quasi-Monte Carlo methods. Soc. Ind. Appl.
Math. (1992)
11. Weyl, H.: Ueber die Gleichverteilung von Zahlen mod. Eins. Math. Ann. 77(3), 313-352
(1916)
12. Kuipers, L., Niederreiter, H.: Uniform Distribution of Sequences. Dover Publishing, Mineola
(2012)
13. Diaconis, P.: The distribution of leading digits and uniform distribution mod 1. Ann. Probab.
5(1), 72-81 (1977)
14. Tolstov, G.P.: Parametric differentiation, and the restricted Denjoy integral. Mat. Sb. 53,
387-392 (1961) (Russian)
15. Bartle, R.G.: A Modern Theory of Integration. Graduate Studies in Mathematics. American
Mathematical Society, Providence (2001)
16. Gordon, R.A.: The Integrals of Lebesgue, Denjoy, Perron and Henstock. Graduate Studies in
Mathematics. American Mathematical Society, Providence (1994)
17. Lee, P.Y., Vyborny, R.: The Integral: An Easy Approach After Kurzweil and Henstock.
Cambridge University Press, Cambridge (2000)
18. Bartle, R.G.: Return to the Riemann integral. Am. Math. Mon. 103, 625-632 (1996)
19. Cousin, P.: Sur les fonctions de n variables complexes. Acta. Math. 19, 1-62 (1895)
20. Dunham, W.: The Calculus Gallery: Masterpieces from Newton to Lebesgue. Princeton
University Press, Princeton (2005)


Chapter 10 
Improper Riemann Integrals 


The Riemann integral applies only to bounded functions defined on compact 
intervals. This severe restriction can be relaxed by considering larger concepts of 
integrability. The simplest one in this respect is improper integrability, that refers to 
the complex-valued functions defined on noncompact intervals A that are Riemann 
integrable on every compact subinterval. For such functions, the integral is defined as 
the limit of Riemann integrals over an increasing sequence of compact subintervals 
whose union is A. The simplicity and usefulness of this extension make it a valuable 
introduction to more advanced theories of integration like Lebesgue integral and the 
Kurzweil—Henstock integral (the generalized Riemann integral). 


10.1 Some Basic Facts 


Throughout this section, we deal with locally integrable functions, that is, with 
functions integrable on every compact subinterval of the domain. Examples of such 
functions are the continuous functions and the monotonic functions, but the reader 
should be aware that this class is much richer, including nonmonotonic functions 
with infinitely many discontinuities such as the fractional part. 

Suppose that f is a locally integrable complex-valued function defined on an
interval of the form [a, b), with −∞ < a < b ≤ ∞. If the limit

\[ \int_a^b f(x)\,dx = \lim_{y \to b-} \int_a^y f(x)\,dx \]

exists in C, then we say that f is Riemann integrable in the improper sense and that
$\int_a^b f(x)\,dx$ is a convergent integral. Otherwise, $\int_a^b f(x)\,dx$ is said to be divergent.

Taking into account the property of additivity of the Riemann integral, it is easy
to see that the integral $\int_a^b f(x)\,dx$ is convergent if for some c ∈ (a, b), the integral
of f over [c, b) is convergent; in this case,

\[ \int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx. \]

Notice that the convergence of the integral $\int_a^b f(x)\,dx$ yields the convergence of
all integrals $\int_c^b f(x)\,dx$ with c ∈ (a, b), and also the above additivity formula.

A similar approach concerns the case of locally integrable functions f defined on
intervals of the form (a, b], with −∞ ≤ a < b < ∞. In this case, the integrability
of f means the existence of the limit

\[ \int_a^b f(x)\,dx = \lim_{y \to a+} \int_y^b f(x)\,dx. \]

When f is a locally integrable function defined on an open interval (a, b) with
−∞ ≤ a < b ≤ ∞, we say that f is Riemann integrable in the improper sense (and
that $\int_a^b f(x)\,dx$ is a convergent integral) if there is a point c ∈ (a, b) such that both
restrictions of f to (a, c] and [c, b) have this property. In this case, the integral of
f over (a, b) is defined as the sum of the two integrals,

\[ \int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx. \]

The correctness of this definition (that is, the independence from the particular
intermediate point c ∈ (a, b)) follows from the property of additivity of Riemann integral.

All important tools in the case of Riemann integral (Leibniz-Newton Formula,
Integration by Parts Formula, Change of Variable Formula etc.) extend to the
context of Riemann improper integrals. In stating and proving these theorems,
we will consider here only the first case, of integrals over intervals [a, b) with
−∞ < a < b ≤ ∞. The reader can easily adapt them to all other cases.


10.1.1 Theorem (The extension of Leibniz-Newton Formula) Let f be a locally
integrable function defined on an interval [a, b), that admits an antiderivative F.
Then f is integrable in the improper sense if and only if the limit $F(b-0) = \lim_{x \to b-0} F(x)$
exists. The integral of f is given by the formula

\[ \int_a^b f(x)\,dx = F(b-0) - F(a). \]


10.1.2 Example (a) For each c ∈ [0, 1), we have

\[ \int_0^c \frac{dx}{\sqrt{1-x}} = -2\sqrt{1-x}\,\Big|_0^c = 2 - 2\sqrt{1-c}, \]

hence the function $f(x) = 1/\sqrt{1-x}$ is Riemann integrable in the improper sense
on [0, 1) and

\[ \int_0^1 \frac{dx}{\sqrt{1-x}} = \lim_{c \to 1-} \left( 2 - 2\sqrt{1-c} \right) = 2. \]

(b) When b > 0 and p ∈ R, the function $1/x^p$ is integrable in the improper sense
on (0, b] if and only if p < 1. Moreover,

\[ \int_0^b \frac{dx}{x^p} = \frac{b^{1-p}}{1-p} \quad \text{for } p < 1. \]

Indeed, if c ∈ (0, b], then

\[ \int_c^b \frac{dx}{x^p} = \begin{cases} \dfrac{b^{1-p} - c^{1-p}}{1-p} & \text{if } p \ne 1 \\ \log b - \log c & \text{if } p = 1, \end{cases} \]

so that $\lim_{c \to 0+} \int_c^b \frac{dx}{x^p}$ exists in R if and only if p < 1.

(c) When a > 0 and p ∈ R, the function $f(x) = 1/x^p$ is Riemann integrable in
the improper sense on [a, ∞) if and only if p > 1. In addition,

\[ \int_a^\infty \frac{dx}{x^p} = \frac{a^{1-p}}{p-1} \quad \text{for } p > 1. \]

(d) The function $e^{-|x|}$ is Riemann integrable in the improper sense on R and

\[ \int_{-\infty}^{\infty} e^{-|x|}\,dx = \int_{-\infty}^{0} e^{x}\,dx + \int_0^{\infty} e^{-x}\,dx = e^{x}\,\Big|_{-\infty}^{0} - e^{-x}\,\Big|_0^{\infty} = 2. \]

(e) The function f(x) = cos x is not integrable on [0, ∞) because the function

\[ I(c) = \int_0^c \cos x\,dx = \sin c, \quad c > 0, \]

has no limit at ∞.
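
The closed forms in parts (a) and (b) make the limiting behaviour easy to tabulate. The Python sketch below is an illustration only; the function names, the choice b = 1 and the sample values of c are assumptions made for the example.

```python
import math


def integral_a(c):
    """integral_0^c dx / sqrt(1 - x) = 2 - 2*sqrt(1 - c), for 0 <= c < 1."""
    return 2.0 - 2.0 * math.sqrt(1.0 - c)


def power_tail(p, c, b=1.0):
    """integral_c^b dx / x^p for 0 < c <= b, using the formula of part (b)."""
    if p == 1:
        return math.log(b) - math.log(c)
    return (b ** (1 - p) - c ** (1 - p)) / (1 - p)


cs = [10.0 ** (-k) for k in (2, 4, 8)]

near_one = integral_a(1.0 - 1e-12)                 # approaches 2 as c -> 1-
convergent = [power_tail(0.5, c) for c in cs]      # p < 1: tends to b^(1-p)/(1-p) = 2
divergent = [power_tail(1.0, c) for c in cs]       # p = 1: grows like -log c

print(near_one)
print(convergent)
print(divergent)
```

For p = 1/2 the values settle near 2, while for p = 1 they keep growing as c decreases, in line with the criterion p < 1.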
Knowing that some functions are integrable can help us decide that other functions
are integrable as well.


10.1.3 Theorem (The Comparison Test) Suppose that $f, g : [a, b) \to \mathbb{R}$ are two
locally integrable functions such that $0 \le f \le g$.

(a) If g is Riemann integrable in the improper sense, then f is also integrable
and

$$0 \le \int_a^b f(x)\,dx \le \int_a^b g(x)\,dx.$$

(b) If f is not Riemann integrable in the improper sense, then neither is g.

An important example is offered by the function $f(x) = e^{-x^2/2}$ for $x \in \mathbb{R}$. Since
this function is continuous, it is also integrable on any compact interval, in particular
on [-1, 1]. On $(-\infty, -1]$ and $[1, \infty)$ it verifies the double inequality

$$0 < f(x) = \frac{1}{1 + \frac{x^2}{2} + \frac{1}{2}\left(\frac{x^2}{2}\right)^2 + \cdots} \le \frac{2}{x^2},$$


which yields the integrability of f on these intervals. For example, the limit
$\lim_{\beta \to \infty} \int_1^\beta f(x)\,dx$ exists because the function $\beta \mapsto \int_1^\beta f(x)\,dx$ is increasing and
bounded by $\int_1^\infty \frac{2}{x^2}\,dx = 2$. See Example 10.1.2 (c). From the integrability on
$(-\infty, -1]$, [-1, 1] and $[1, \infty)$, we infer the existence of the limit

$$\int_{-\infty}^{\infty} f(x)\,dx = \lim_{\alpha \to -\infty} \int_\alpha^{-1} f(x)\,dx + \int_{-1}^{1} f(x)\,dx + \lim_{\beta \to \infty} \int_1^\beta f(x)\,dx.$$

Pierre Simon Laplace proved that

$$\int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi}.$$


See Exercise 15 for a simple argument. This integral has many applications in prob- 
ability theory, statistics, mathematical physics, mathematical finance etc. 

10.1.4 Theorem (The extension of Absolute Value Inequality) Suppose that
$f, g : [a, b) \to \mathbb{R}$ are two locally integrable functions such that $|f| \le g$ and g is Riemann
integrable in the improper sense. Then f and |f| are also Riemann integrable in the
improper sense and

$$\left|\int_a^b f(x)\,dx\right| \le \int_a^b |f(x)|\,dx.$$

Notice that by Theorem 10.1.3 we also have

$$\int_a^b |f(x)|\,dx \le \int_a^b g(x)\,dx.$$

The improper integrals $\int_a^b f(x)\,dx$ with the property that $\int_a^b |f(x)|\,dx$ is convergent
are called absolutely convergent. According to Theorem 10.1.4, every absolutely
convergent integral is also convergent (a result that reminds us of the case of numerical
series).

The next three results provide other instances of convergence of integrals. 


10.1.5 Theorem (Integration by Parts) Let $f, g : [a, b) \to \mathbb{C}$ be two piecewise $C^1$
functions such that the limit $(fg)(b-) = \lim_{x \to b-} f(x)g(x)$ exists. Then, if one of the
functions $f'g$ or $fg'$ is Riemann integrable in the improper sense, the other one is
also integrable and

$$\int_a^b f'(x)g(x)\,dx = (fg)\,\Big|_a^{b-} - \int_a^b f(x)g'(x)\,dx.$$


10.1.6 Theorem (The Change of Variable in the Integral) Let $f : [a, b) \to \mathbb{R}$ be a
piecewise continuous function and $\varphi : [\alpha, \beta) \to \mathbb{R}$ a piecewise $C^1$ function, strictly
increasing and such that $\varphi(\alpha) = a$ and $\lim_{t \to \beta-} \varphi(t) = b$. Then, if one of the functions
f or $(f \circ \varphi) \cdot \varphi'$ is Riemann integrable in the improper sense, then so is the other one
and

$$\int_a^b f(x)\,dx = \int_\alpha^\beta f(\varphi(t))\,\varphi'(t)\,dt.$$


We conclude this section by presenting a criterion of integrability which applies 
to functions whose absolute value is not necessarily integrable. 


10.1.7 Theorem (Abel-Dirichlet Test of Integrability) Let $f : [a, \infty) \to \mathbb{C}$ be a
piecewise $C^1$ function such that

$$\lim_{x \to \infty} f(x) = 0$$

and $|f'|$ is Riemann integrable in the improper sense. Let $g : [a, \infty) \to \mathbb{C}$ be
a piecewise continuous function which has a bounded antiderivative. Then fg is
Riemann integrable in the improper sense.


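Proof We will use Cauchy's Criterion for limits. Let G be a bounded antiderivative
of g and put

$$M = \sup \{|G(x)| : x \in [a, \infty)\}.$$

Let $\varepsilon > 0$. Since $|f'|$ is Riemann integrable in the improper sense, there is $\delta > 0$
such that for $q > p \ge \delta$, we have

$$\int_p^q |f'(x)|\,dx < \frac{\varepsilon}{2M}.$$

Since $\lim_{x \to \infty} f(x) = 0$, we may assume (by increasing $\delta$ if necessary) that also

$$q > p \ge \delta \quad \text{implies} \quad \sup\{|f(p)|, |f(q)|\} < \varepsilon/4M.$$

Consequently, for $q > p \ge \delta$, integration by parts gives

$$\left|\int_p^q f(x)g(x)\,dx\right| = \left|\int_p^q f(x)G'(x)\,dx\right| = \left|f(x)G(x)\,\Big|_p^q - \int_p^q f'(x)G(x)\,dx\right|$$

$$\le 2M \cdot \sup\{|f(p)|, |f(q)|\} + M \int_p^q |f'(x)|\,dx$$

$$< \varepsilon/2 + \varepsilon/2 = \varepsilon,$$

and the proof ends by applying Cauchy's Criterion 6.2.5.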


The Abel-Dirichlet Test of Integrability easily yields the convergence of Euler's
integral,

$$\int_0^\infty \frac{\sin x}{x}\,dx.$$

Indeed, taking into account the Riemann integrability of the function $\frac{\sin x}{x}$ over
[0, 1] (no matter how it is defined at x = 0), we can reduce the problem to the
convergence of the integral $\int_1^\infty \frac{\sin x}{x}\,dx$. This results from the Abel-Dirichlet Test
of Integrability applied to $g(x) = \sin x$ and $f(x) = \frac{1}{x}$. An alternative argument is
offered by Exercise 14. One can prove that

$$\int_0^\infty \frac{\sin x}{x}\,dx = \frac{\pi}{2}.$$

See the note after Exercise 5, Sect. 11.7.
The absolute value of the function $\frac{\sin x}{x}$ is not integrable on $[1, \infty)$. To see this,
let us assume that the contrary is true. Since $\sin^2 x \le |\sin x|$, by the Comparison Test
of Integrability (Exercise 1 below) we infer that

$$\frac{\sin^2 x}{x} = \frac{1 - \cos 2x}{2x}$$

is also integrable on $[1, \infty)$. As above, using the Abel-Dirichlet Test, we infer that
$\frac{\cos 2x}{2x}$ is integrable on $[1, \infty)$, which implies the integrability of the function

$$\frac{1}{2x} = \frac{1 - \cos 2x}{2x} + \frac{\cos 2x}{2x},$$

a contradiction. Therefore $\left|\frac{\sin x}{x}\right|$ is not integrable.
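
The contrast between convergence and absolute convergence of Euler's integral can be illustrated numerically. The sketch below is an illustration only: it uses a composite Simpson rule of our own (`simpson`, `sinc` and `partial_integral` are hypothetical helper names) and integrates arch by arch over $[k\pi, (k+1)\pi]$, in the spirit of Exercise 14.

```python
import math

def simpson(f, a, b, n=200):
    """Composite Simpson rule on [a, b] with n subintervals (n is made even)."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def sinc(x):
    """sin(x)/x, extended by continuity at x = 0."""
    return 1.0 if x == 0.0 else math.sin(x) / x

def partial_integral(periods, absolute=False):
    """Integral of sin(x)/x (or of its absolute value) over [0, periods*pi]."""
    g = (lambda x: abs(sinc(x))) if absolute else sinc
    # each arch [k*pi, (k+1)*pi] is handled separately, where the sign is constant
    return sum(simpson(g, k * math.pi, (k + 1) * math.pi) for k in range(periods))

print(partial_integral(500), math.pi / 2)   # the partial integrals approach pi/2
print(partial_integral(50, absolute=True), partial_integral(500, absolute=True))
```

The first line shows values close to $\pi/2$, whereas the integrals of the absolute value keep increasing (roughly like a multiple of the logarithm of the number of arches).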


Exercises 


1. Prove the Comparison Test of Integrability as stated in Theorem 10.1.3. 

2. Suppose that A is a noncompact interval and that $f : A \to \mathbb{C}$ is a locally
integrable function. Prove that f is Riemann integrable in the improper sense if
and only if Re f and Im f both are integrable in the improper sense. Moreover,

$$\int_A f(x)\,dx = \int_A \operatorname{Re} f(x)\,dx + i \int_A \operatorname{Im} f(x)\,dx.$$


3. Compute the integral $\int_0^\infty e^{-(a+ib)x}\,dx$, where $a > 0$ and $b \in \mathbb{R}$.

4. Prove Theorem 10.1.5, concerning the integration by parts.

5. Prove Theorem 10.1.6, concerning the change of variable in Riemann improper 
integrals. 

6. Compute the integrals

$$I = \int_0^{\pi/2} \frac{\cos^n x}{\sin^n x + \cos^n x}\,dx \quad \text{and} \quad J = \int_0^\infty \frac{dx}{(1 + x^2)(1 + x^n)}.$$

[Hint: For I, use the change of variable $x = \frac{\pi}{2} - t$. The second integral reduces
to the first one via the change of variable $x = \tan t$.]

7. (Cauchy's Integral Test). Let $f : [1, \infty) \to [0, \infty)$ be a continuous decreasing
function. Prove that the series $\sum_{n=1}^{\infty} f(n)$ is convergent if and only if f is Riemann
integrable in the improper sense.

8. Use the result of the previous exercise to show that the generalized harmonic
series $\sum_{n=1}^{\infty} 1/n^{\alpha}$ is convergent if and only if $\alpha > 1$.

9. (The translation invariance of the improper integral). Let $f : \mathbb{R} \to \mathbb{C}$ be a
function which is Riemann integrable in the improper sense. If $t \in \mathbb{R}$, prove that

$$\int_{-\infty}^{\infty} f(x + t)\,dx = \int_{-\infty}^{\infty} f(x)\,dx.$$

10. (Barbalat's Lemma [1]). Suppose that $f : [0, \infty) \to \mathbb{R}$ is uniformly continuous
and Riemann integrable in the improper sense. Prove that $f(t) \to 0$ as $t \to \infty$.

11. Compute the integrals

$$I_n = \int_0^\infty \frac{dx}{(1 + x^2)^n}, \quad n \ge 1,$$

by observing that $I_1 = \frac{\pi}{2}$ and $I_{n+1} = \frac{2n-1}{2n} I_n$.


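12. Prove the existence of the following integrals

$$\int_0^\infty \sin x^2\,dx; \quad \int_0^\infty \cos x^2\,dx; \quad \int_a^b \frac{dx}{\sqrt{(x - a)(b - x)}}$$

and compute the third one.
Note The first two integrals are special cases of some formulas proved by A.-L.
Cauchy in 1815: For all $t \in \mathbb{R}$,

$$\int_0^\infty \sin x^2 \cos tx\,dx = \frac{1}{2} \sqrt{\frac{\pi}{2}} \left[\cos\left(\frac{t^2}{4}\right) - \sin\left(\frac{t^2}{4}\right)\right]$$

and

$$\int_0^\infty \cos x^2 \cos tx\,dx = \frac{1}{2} \sqrt{\frac{\pi}{2}} \left[\cos\left(\frac{t^2}{4}\right) + \sin\left(\frac{t^2}{4}\right)\right].$$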

13. (Approximating integrals by series). Suppose that $f : [0, \infty) \to [0, \infty)$ is a
decreasing integrable function and $(a_n)_n$ is a sequence of positive numbers such
that $\lim_{n \to \infty} \frac{a_n}{n} = k > 0$. Show that

$$\frac{1}{k} \int_0^\infty f(x)\,dx = \lim_{h \to 0+} h \sum_{n=1}^{\infty} f(h a_n).$$

14. Prove that the series $\sum_{n=0}^{\infty} \int_{n\pi}^{(n+1)\pi} \frac{\sin x}{x}\,dx$ verifies Leibniz Alternating Series Test.
Then, infer the convergence of the integral $\int_0^\infty \frac{\sin x}{x}\,dx$.


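15. (Computation of the probability integral [2]). Consider the function

$$F(x) = e^{-x} - \left(1 - \frac{x}{n}\right)^n \quad \text{for } x \in [0, n]$$

(n being a positive integer). Prove that:

(a) $e^{-t} \ge 1 - t$ for every $t \in [0, 1]$ and deduce that F(x) is nonnegative;

(b) the function $x e^{-x}$ attains its maximum at x = 1;

(c) the maximum of the function F(x) is attained at an interior point $x_0 \in (0, n)$;

(d) the equation $F'(x_0) = 0$ yields $e^{-x_0} = \left(1 - \frac{x_0}{n}\right)^{n-1}$, so that

$$F(x_0) = e^{-x_0} - e^{-x_0} \left(1 - \frac{x_0}{n}\right) = \frac{x_0 e^{-x_0}}{n};$$

(e) the function F(x) verifies the estimate $0 \le F(x) \le \frac{1}{ne}$.

According to (e),

$$0 \le \int_0^{\sqrt{n}} e^{-x^2}\,dx - \int_0^{\sqrt{n}} \left(1 - \frac{x^2}{n}\right)^n dx \le \frac{1}{e\sqrt{n}},$$

which yields (via the change of variable $x = \sqrt{n} \sin t$),

$$\int_0^\infty e^{-x^2}\,dx = \lim_{n \to \infty} \int_0^{\sqrt{n}} \left(1 - \frac{x^2}{n}\right)^n dx = \lim_{n \to \infty} \sqrt{n} \int_0^{\pi/2} \cos^{2n+1} t\,dt.$$

The integrals $I_n = \int_0^{\pi/2} \cos^{2n+1} x\,dx$ can be computed by recurrence.
Therefore

$$\int_0^\infty e^{-x^2}\,dx = \lim_{n \to \infty} \left(\sqrt{n} \cdot \frac{2n}{2n + 1} \cdot \frac{2n - 2}{2n - 1} \cdots \frac{2}{3}\right)$$

$$= \lim_{n \to \infty} \frac{\sqrt{n}}{\sqrt{2n + 1}} \sqrt{\frac{2 \cdot 2}{1 \cdot 3} \cdot \frac{4 \cdot 4}{3 \cdot 5} \cdots \frac{2n \cdot 2n}{(2n - 1) \cdot (2n + 1)}} = \frac{\sqrt{\pi}}{2},$$

according to Wallis's Formula. See Exercise 8, Sect. 9.2.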
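
The limit appearing in Exercise 15 converges rather slowly, which a short computation makes visible. The snippet below is only an illustration (the function name `sqrt_n_times_cos_integral` is ours): it evaluates $\sqrt{n}\, I_n$ with $I_n = \prod_{k=1}^{n} \frac{2k}{2k+1}$, the value produced by the recurrence $I_n = \frac{2n}{2n+1} I_{n-1}$, $I_0 = 1$.

```python
import math

def sqrt_n_times_cos_integral(n: int) -> float:
    """sqrt(n) * I_n, where I_n = prod_{k=1}^{n} 2k/(2k+1) is the integral of cos^(2n+1) over [0, pi/2]."""
    # the product is accumulated through logarithms to stay away from underflow
    log_product = math.fsum(math.log(2 * k) - math.log(2 * k + 1) for k in range(1, n + 1))
    return math.sqrt(n) * math.exp(log_product)

for n in (10, 1000, 100000):
    print(n, sqrt_n_times_cos_integral(n), math.sqrt(math.pi) / 2)
```

The printed values increase towards $\sqrt{\pi}/2 \approx 0.8862$, with an error of order 1/n.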

10.2 The Euler-Maclaurin Formula


In this section, we describe a powerful connection between integrals and finite sums
that can be used to evaluate with high precision the finite sums by integrals (and
conversely, the integrals by finite sums). The basic ingredient is integration by parts,
performed in a framework involving the Bernoulli polynomials $B_n(x)$. According to
Sect. 8.7, these polynomials can be defined recursively as follows:


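$$B_0(x) = 1, \qquad B_n'(x) = n B_{n-1}(x) \quad \text{and} \quad \int_0^1 B_n(x)\,dx = 0 \quad \text{for } n \ge 1.$$

Thus,

$$B_0(x) = 1, \quad B_1(x) = x - \frac{1}{2}, \quad B_2(x) = x^2 - x + \frac{1}{6}, \quad B_3(x) = x^3 - \frac{3}{2} x^2 + \frac{1}{2} x,$$

and so on. Moreover,

$$B_n(0) = B_n(1) \quad \text{for } n \ge 2.$$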


We can associate to each Bernoulli polynomial a periodic Bernoulli function (of
period 1) via the formula

$$\widetilde{B}_n(x) = B_n(x - \lfloor x \rfloor), \quad \text{for } n \in \mathbb{N} \text{ and } x \in \mathbb{R}.$$

Equivalently, $\widetilde{B}_n(x)$ is the extension by periodicity of the restriction of $B_n(x)$ to [0, 1).
Except for n = 1, all these functions are continuous.
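
The recursive definition translates directly into a computation with exact rational coefficients. The following sketch is an illustration under our own conventions (coefficient lists ordered from the constant term upward; the names `bernoulli_polynomials`, `evaluate` and `periodic_bernoulli` are hypothetical): each $B_n$ is obtained by integrating $n B_{n-1}$ and then fixing the constant term so that the integral over [0, 1] vanishes.

```python
import math
from fractions import Fraction

def bernoulli_polynomials(count):
    """Coefficient lists (constant term first) of B_0, ..., B_{count-1}."""
    polys = [[Fraction(1)]]
    for n in range(1, count):
        prev = polys[-1]
        # antiderivative of n*B_{n-1}: the coefficient of x^(j+1) is n*c_j/(j+1)
        coeffs = [Fraction(0)] + [n * c / (j + 1) for j, c in enumerate(prev)]
        # choose the constant term so that the integral over [0, 1] is zero
        coeffs[0] = -sum(c / (j + 1) for j, c in enumerate(coeffs))
        polys.append(coeffs)
    return polys

def evaluate(coeffs, x):
    """Horner evaluation of the polynomial with the given coefficient list."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result

def periodic_bernoulli(coeffs, x):
    """Periodic Bernoulli function: the polynomial evaluated at x - floor(x)."""
    return evaluate([float(c) for c in coeffs], x - math.floor(x))

B = bernoulli_polynomials(4)
print(B[2], B[3])   # coefficients of x^2 - x + 1/6 and of x^3 - (3/2)x^2 + (1/2)x
```

A grid search with `evaluate` also shows that the maximum of $|B_3|$ on [0, 1] is about 0.0481, slightly below 1/20.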


10.2.1 Lemma (The basic Euler-Maclaurin Formula) For all integers $n > m \ge 0$
and all functions $f \in C^1([m, n])$, we have

$$\sum_{k=m}^{n} f(k) = \int_m^n f(x)\,dx + \frac{f(m) + f(n)}{2} + \int_m^n f'(x)\,\widetilde{B}_1(x)\,dx.$$


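Proof Indeed,

$$\sum_{k=m}^{n-1} f(k) - \int_m^n f(x)\,dx = \sum_{k=m}^{n-1} \int_k^{k+1} (f(k) - f(x))\,(x - k - 1)'\,dx$$

$$= \sum_{k=m}^{n-1} \int_k^{k+1} f'(x)\,(x - k - 1)\,dx$$

$$= \sum_{k=m}^{n-1} \int_k^{k+1} f'(x) \left(x - \lfloor x \rfloor - \frac{1}{2}\right) dx - \frac{1}{2} \sum_{k=m}^{n-1} \int_k^{k+1} f'(x)\,dx$$

$$= \int_m^n f'(x)\,\widetilde{B}_1(x)\,dx + \frac{f(m) - f(n)}{2},$$

and the proof ends by adding f(n) to both sides.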


Suppose that f belongs to $C^1([1, \infty))$, $\lim_{n \to \infty} f(n) = 0$, and the integral
$\int_1^\infty |f'(x)|\,dx$ is convergent. In particular, this is the case of the function $\frac{1}{x}$ when
restricted to $[1, \infty)$. By Lemma 10.2.1,

$$\sum_{k=1}^{n} f(k) = \int_1^n f(x)\,dx + \frac{f(1) + f(n)}{2} + \int_1^n f'(x)\,\widetilde{B}_1(x)\,dx$$

$$= \int_1^n f(x)\,dx + \gamma_f + \frac{1}{2} f(n) - \int_n^\infty f'(x)\,\widetilde{B}_1(x)\,dx,$$

where

$$\gamma_f = \frac{1}{2} f(1) + \int_1^\infty f'(x)\,\widetilde{B}_1(x)\,dx$$

is the so called Euler-Maclaurin constant of f. The absolute convergence of the last
integral follows from the fact that $|f'(x)\,\widetilde{B}_1(x)| \le \frac{1}{2} |f'(x)|$ and the convergence
of the integral $\int_1^\infty |f'(x)|\,dx$. Due to the fact that $\lim_{n \to \infty} f(n) = 0$,

$$\lim_{n \to \infty} \left(\sum_{k=1}^{n} f(k) - \int_1^n f(x)\,dx\right) = \gamma_f.$$

In the case where $f(x) = \frac{1}{x}$, $x \in [1, \infty)$, the Euler-Maclaurin constant of f is
Euler's constant $\gamma = 0.577215664\ldots$.

If f is of class $C^p$ with p > 1, then repeated integration by parts will provide
better results.


10.2.2 Lemma (Euler-Maclaurin Formula of third order) For all integer numbers
$n > m \ge 0$ and all functions $f \in C^3([m, n])$, we have

$$\sum_{k=m}^{n} f(k) = \int_m^n f(x)\,dx + \frac{f(m) + f(n)}{2} + \frac{f'(n) - f'(m)}{12} + R(f; m, n),$$

where the remainder R(f; m, n) verifies the estimate

$$|R(f; m, n)| \le \frac{1}{120} \int_m^n |f'''(x)|\,dx.$$


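Proof Indeed, according to Lemma 10.2.1, we have to evaluate $\int_m^n f'(x)\,\widetilde{B}_1(x)\,dx$.
This is done as follows:

$$\int_m^n f'(x)\,\widetilde{B}_1(x)\,dx = \int_m^n f'(x) \left(x - \lfloor x \rfloor - \frac{1}{2}\right) dx = \sum_{k=m}^{n-1} \int_k^{k+1} f'(x) \left(x - k - \frac{1}{2}\right) dx$$

$$= \frac{1}{2} \sum_{k=m}^{n-1} \int_k^{k+1} f'(x)\,B_2'(x - k)\,dx$$

$$= \frac{1}{12} \sum_{k=m}^{n-1} \left(f'(k + 1) - f'(k)\right) - \frac{1}{2} \sum_{k=m}^{n-1} \int_k^{k+1} f''(x)\,B_2(x - k)\,dx$$

$$= \frac{f'(n) - f'(m)}{12} - \frac{1}{6} \sum_{k=m}^{n-1} \int_k^{k+1} f''(x)\,B_3'(x - k)\,dx$$

$$= \frac{f'(n) - f'(m)}{12} + \frac{1}{6} \sum_{k=m}^{n-1} \int_k^{k+1} f'''(x)\,B_3(x - k)\,dx$$

$$= \frac{f'(n) - f'(m)}{12} + \frac{1}{6} \int_m^n f'''(x)\,\widetilde{B}_3(x)\,dx.$$

The above computation gives the formula in the statement of Lemma 10.2.2 with

$$R(f; m, n) = \frac{1}{6} \int_m^n f'''(x)\,\widetilde{B}_3(x)\,dx.$$

According to the mean value theorem for integrals,

$$|R(f; m, n)| \le \frac{1}{6} \left(\max_{0 \le x \le 1} |B_3(x)|\right) \int_m^n |f'''(x)|\,dx,$$

and the proof ends by noticing that $\max_{0 \le x \le 1} |B_3(x)| < \frac{1}{20}$. See Exercise 1.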
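
For $f(x) = \frac{1}{x}$, $x \in [1, \infty)$, and n > m = 100, the remainder $R_n = R(f; 100, n)$
verifies

$$|R_n| \le \frac{1}{120} \int_{100}^{n} \frac{6}{x^4}\,dx < \frac{1}{60} \cdot 10^{-6} < 1.7 \times 10^{-8}.$$

Then

$$\sum_{k=1}^{n} \frac{1}{k} = \sum_{k=1}^{99} \frac{1}{k} + \int_{100}^{n} \frac{dx}{x} + \frac{1}{200} + \frac{1}{2n} - \frac{1}{12 n^2} + \frac{1}{12 \cdot 10^4} + R_n$$

$$= \log n + \left(\sum_{k=1}^{99} \frac{1}{k} - \log 100 + \frac{1}{200} + \frac{1}{12 \cdot 10^4}\right) + \frac{1}{2n} - \frac{1}{12 n^2} + R_n, \qquad (10.1)$$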


where the expression in parentheses in (10.1) was truncated at nine places. Consequently

$$\sum_{k=1}^{1000^{1000}} \frac{1}{k} = 0.577215664 + 1000 \log 1000 = 6908.332494646$$


is correct with seven decimal places. In fact, for $n = 1000^{1000}$,

$$10^{-9} + 1.7 \times 10^{-8} + 10^{-9} + \frac{1}{2n} < 10^{-7}.$$
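
The accuracy claimed for the third order formula can be checked against a direct summation for moderately large n. The code below is a sketch under our own naming (`harmonic_direct`, `harmonic_euler_maclaurin` and `euler_constant_estimate` are hypothetical helpers); it applies Lemma 10.2.2 to f(x) = 1/x with m = 100 and simply drops the remainder.

```python
import math

def harmonic_direct(n: int) -> float:
    """Sum of 1/k for k = 1, ..., n, added up term by term."""
    return math.fsum(1.0 / k for k in range(1, n + 1))

def harmonic_euler_maclaurin(n: int, m: int = 100) -> float:
    """Approximation of the same sum for n > m by Lemma 10.2.2, remainder dropped."""
    head = math.fsum(1.0 / k for k in range(1, m))      # sum over k = 1, ..., m-1
    integral = math.log(n) - math.log(m)                # integral of 1/x over [m, n]
    endpoints = 0.5 * (1.0 / m + 1.0 / n)               # (f(m) + f(n)) / 2
    derivatives = (-1.0 / n**2 + 1.0 / m**2) / 12.0     # (f'(n) - f'(m)) / 12
    return head + integral + endpoints + derivatives

def euler_constant_estimate(m: int = 100) -> float:
    """The n-independent part of the approximation; it is close to Euler's constant."""
    head = math.fsum(1.0 / k for k in range(1, m))
    return head - math.log(m) + 0.5 / m + 1.0 / (12.0 * m**2)

print(harmonic_direct(10**5), harmonic_euler_maclaurin(10**5))
print(euler_constant_estimate())                 # approximately 0.5772156649
print(0.577215664 + 1000 * math.log(1000))       # approximately 6908.3324946
```

The observed discrepancy for n = 10^5 is far below the guaranteed bound 1.7 x 10^{-8}, which is consistent with the estimate of the remainder being rather generous.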


It is worth noticing that the argument of Lemma 9.8.1 yields (via mathematical 
induction) the following odd-order derivative version of Euler—Maclaurin Formula: 


10.2.3 Theorem (General form of Euler-Maclaurin Formula) For all integer numbers
$n > m \ge 0$ and all functions $f \in C^{2p+1}([m, n], \mathbb{R})$, we have

$$\sum_{k=m}^{n} f(k) = \int_m^n f(x)\,dx + \frac{f(m) + f(n)}{2} + \sum_{k=1}^{p} \frac{B_{2k}}{(2k)!} \left[f^{(2k-1)}(n) - f^{(2k-1)}(m)\right]$$

$$+ \frac{1}{(2p + 1)!} \int_m^n f^{(2p+1)}(x)\,\widetilde{B}_{2p+1}(x)\,dx.$$

Here $B_{2k} = B_{2k}(0)$ are the Bernoulli numbers.

The general form of Euler-Maclaurin Formula encompasses many useful formulas
including Bernoulli's Power Sum Formula. Indeed, for m = 1 and $f(x) = x^s$
(with s a positive integer), we obtain


Exercises 


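1. Prove that $\max_{0 \le x \le 1} |B_3(x)| = \frac{\sqrt{3}}{36} < \frac{1}{20}$.


2. (Maclaurin-Cauchy). If $f : [1, \infty) \to \mathbb{R}$ is positive, continuous, and tends
monotonically to 0 as $x \to \infty$, then the limit

$$\lim_{n \to \infty} \left(\sum_{k=1}^{n} f(k) - \int_1^n f(x)\,dx\right)$$

exists.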


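3. (a) Infer from the basic Euler-Maclaurin Formula that

$$\sum_{k=1}^{n} \frac{\log k}{k} = \int_1^n \frac{\log x}{x}\,dx + C + \frac{\log n}{2n} + O\left(\frac{\log n}{n^2}\right)$$

$$= C + \frac{1}{2} \log^2 n + \frac{\log n}{2n} + O\left(\frac{\log n}{n^2}\right),$$

where $C = \lim_{n \to \infty} \left(\sum_{k=1}^{n} \frac{\log k}{k} - \frac{1}{2} \log^2 n\right)$.


(b) Use this fact to show that

$$\sum_{k=1}^{2n} (-1)^k \frac{\log k}{k} = \sum_{k=1}^{n} \frac{\log 2k}{k} - \sum_{k=1}^{2n} \frac{\log k}{k} = \left(\gamma - \frac{1}{2} \log 2\right) \log 2 + O\left(\frac{\log n}{n}\right).$$


(c) Conclude that $\sum_{k=1}^{\infty} (-1)^k \frac{\log k}{k} = \left(\gamma - \frac{1}{2} \log 2\right) \log 2$.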

4. Prove that the integral $\int_1^\infty \frac{\widetilde{B}_1(x)}{x}\,dx$ is convergent. The value of this integral is
$\log \sqrt{2\pi} - 1$; see the next exercise.


[Hint: Notice that $\int_1^n \frac{\widetilde{B}_1(x)}{x}\,dx$ is the sum of the first 2(n - 1) terms of an alternating
series with terms of decreasing magnitude.]

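5. (A new proof of Stirling's Formula). According to Wallis's Formula (see
Exercise 9, Sect. 9.2),

$$\lim_{n \to \infty} \frac{2^{4n} (n!)^4}{((2n)!)^2 (2n + 1)} = \frac{\pi}{2},$$

which yields

$$4n \log 2 + 4 \log(n!) - 2 \log((2n)!) - \log(2n + 1) \to \log \frac{\pi}{2}.$$

The logarithms of factorials can be computed via the basic Euler-Maclaurin
Formula,

$$\log n! = n \log n - n + 1 + \frac{\log n}{2} + \int_1^n \frac{\widetilde{B}_1(x)}{x}\,dx,$$

by putting $\int_1^n \frac{\widetilde{B}_1(x)}{x}\,dx = I - r_n$, where

$$I = \int_1^\infty \frac{\widetilde{B}_1(x)}{x}\,dx \quad \text{and} \quad r_n = \int_n^\infty \frac{\widetilde{B}_1(x)}{x}\,dx.$$

The preceding exercise shows that $r_n \to 0$.
(a) Noticing that

$$\log n + 2 + 2I - 4r_n - \log 2 + 2r_{2n} - \log(2n + 1) \to \log \frac{\pi}{2},$$

prove that $I = \log \sqrt{2\pi} - 1$ and

$$\log n! = n \log n - n + 1 + \frac{\log n}{2} + \log \sqrt{2\pi} - 1 - \int_n^\infty \frac{\widetilde{B}_1(x)}{x}\,dx.$$

Conclude that

$$n! \sim \sqrt{2\pi n} \left(\frac{n}{e}\right)^n \quad \text{as } n \to \infty,$$

which is Stirling's Formula.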
(b) Estimate $r_n$ by noticing that $r_n = \sum_{k=n}^{\infty} (r_k - r_{k+1})$, and conclude that

$$\sqrt{2\pi n}\; n^n e^{-n} < n! < \sqrt{2\pi n}\; n^n e^{-n + 1/(12n)}.$$


10.3 The Gamma and Beta Functions 


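The gamma function $\Gamma : (0, \infty) \to \mathbb{R}$ is defined by the formula

$$\Gamma(x) = \int_0^\infty t^{x-1} e^{-t}\,dt \quad \text{for } x > 0.$$

The fact that the function $f(t) = t^{x-1} e^{-t}$ is Riemann integrable in the
improper sense on $(0, \infty)$ follows from the integrability of the functions
$f_1(t) = (t^{x-1} e^{-t})\,\chi_{(0,1]}(t)$ and $f_2(t) = (t^{x-1} e^{-t})\,\chi_{(1,\infty)}(t)$. Indeed, in the case of $f_1(t)$
we have to notice that

$$0 \le f_1(t) \le \frac{1}{t^{1-x}}\,\chi_{(0,1]}(t) \quad \text{and} \quad 1 - x < 1.$$

The integrability of $f_1(t)$ is now a consequence of the Comparison Test of Integrability
(Theorem 10.1.3 above) and Example 10.1.2 (b). A similar argument works
for $f_2(t)$, since

$$0 \le f_2(t) \le \frac{t^{\lfloor x \rfloor}}{e^t} = \frac{t^{\lfloor x \rfloor}}{\sum_{n=0}^{\infty} \frac{t^n}{n!}} \le \frac{t^{\lfloor x \rfloor}}{\frac{t^{\lfloor x \rfloor + 2}}{(\lfloor x \rfloor + 2)!}} = \frac{(\lfloor x \rfloor + 2)!}{t^2}.$$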

In order to outline an important property of convexity of the gamma function, we 
shall need the following result important in itself. 


10.3.1 Lemma (The Rogers-Hölder Inequality for improper integrals) Let $p, q \in (1, \infty)$
with 1/p + 1/q = 1, and let $f, g : [0, \infty) \to \mathbb{C}$ be two continuous functions
such that $|f|^p$ and $|g|^q$ are both Riemann integrable in the improper sense. Then fg
is Riemann integrable in the improper sense and

$$\left|\int_0^\infty fg\,dx\right| \le \left(\int_0^\infty |f|^p\,dx\right)^{1/p} \left(\int_0^\infty |g|^q\,dx\right)^{1/q}. \qquad (10.2)$$

For p = q = 2, the above inequality is known as the Cauchy-Buniakovski-Schwarz
inequality.

Proof Put

$$\|f\|_{L^p} = \left(\int_0^\infty |f|^p\,dx\right)^{1/p} \quad \text{and} \quad \|g\|_{L^q} = \left(\int_0^\infty |g|^q\,dx\right)^{1/q}.$$

If $\|f\|_{L^p} = 0$, then f is identically 0 and the inequality in the statement is trivial. The
same applies when $\|g\|_{L^q} = 0$. Suppose now that both these numbers are nonzero.
According to Young's inequality (Exercise 2, Sect. 8.2), we have

$$\frac{|f(x)g(x)|}{\|f\|_{L^p} \|g\|_{L^q}} \le \frac{1}{p} \frac{|f(x)|^p}{\|f\|_{L^p}^p} + \frac{1}{q} \frac{|g(x)|^q}{\|g\|_{L^q}^q}$$

for all $x \in [0, \infty)$. According to the Comparison Test of Integrability, the function
fg is absolutely integrable, so by integrating both sides, we get

$$\frac{1}{\|f\|_{L^p} \|g\|_{L^q}} \int_0^\infty |f(x)g(x)|\,dx \le \frac{1}{p} + \frac{1}{q} = 1.$$

The proof ends by taking into account Theorem 10.1.4.


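10.3.2 Theorem The gamma function has the following properties:
(a) $\Gamma(x + 1) = x\,\Gamma(x)$ for x > 0;

(b) $\Gamma(1) = 1$;

(c) $\Gamma$ is log-convex.

Proof (a) Using the integration by parts, we get

$$\Gamma(x + 1) = \int_0^\infty t^x e^{-t}\,dt = \left(-t^x e^{-t}\right)\Big|_0^\infty + x \int_0^\infty t^{x-1} e^{-t}\,dt = x\,\Gamma(x)$$

for every x > 0. The property (b) is obvious.


(c) Let x, y > 0 and let $\lambda, \mu > 0$ with $\lambda + \mu = 1$. Then, by the Rogers-Hölder
Inequality,

$$\Gamma(\lambda x + \mu y) = \int_0^\infty t^{\lambda x + \mu y - 1} e^{-t}\,dt = \int_0^\infty \left(t^{x-1} e^{-t}\right)^{\lambda} \left(t^{y-1} e^{-t}\right)^{\mu} dt$$

$$\le \left(\int_0^\infty t^{x-1} e^{-t}\,dt\right)^{\lambda} \left(\int_0^\infty t^{y-1} e^{-t}\,dt\right)^{\mu} = \Gamma^{\lambda}(x)\,\Gamma^{\mu}(y).$$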

The gamma function is an extension of the factorial function. 

10.3.3 Corollary $\Gamma(n + 1) = n!$ for every $n \in \mathbb{N}$.


Fig. 10.1 The graph of the function $\Gamma$ on the interval
$(-4, 4] \setminus \{0, -1, -2, -3\}$


10.3.4 Corollary The gamma function is convex, continuous and $x\,\Gamma(x)$ approaches
1 as $x \to 0+$.


$\Gamma$ has a minimum on $(0, \infty)$ at 1.461632145...; this was first noticed by
C.F. Gauss. See Fig. 10.1.
The gamma function is the unique log-convex extension of the factorial function: 


10.3.5 Theorem (H. Bohr and J. Mollerup) Suppose that $f : (0, \infty) \to \mathbb{R}$ is a
function which verifies the following three conditions:

(a) f(x + 1) = x f(x) for every x > 0;

(b) f(1) = 1;

(c) f is log-convex.

Then $f = \Gamma$.


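Proof By induction, from (a) and (b) we infer that f(n + 1) = n! for every $n \in \mathbb{N}$.
Now, let $x \in (0, 1]$ and $n \in \mathbb{N}^*$. Then by (c),

$$f(n + 1 + x) = f((1 - x)(n + 1) + x(n + 2))$$
$$\le f^{1-x}(n + 1) \cdot f^{x}(n + 2)$$
$$= f^{1-x}(n + 1) \cdot (n + 1)^x \cdot f^{x}(n + 1)$$
$$= (n + 1)^x \cdot f(n + 1)$$
$$= (n + 1)^x \cdot n!$$

and

$$n! = f(n + 1) = f(x(n + x) + (1 - x)(n + 1 + x))$$
$$\le f^{x}(n + x) \cdot f^{1-x}(n + 1 + x)$$
$$= (n + x)^{-x} \cdot f^{x}(n + 1 + x) \cdot f^{1-x}(n + 1 + x)$$
$$= (n + x)^{-x} \cdot f(n + 1 + x).$$

Thus, since $f(n + 1 + x) = (n + x)(n - 1 + x) \cdots x f(x)$, we obtain

$$\left(1 + \frac{x}{n}\right)^x \le \frac{(n + x)(n - 1 + x) \cdots x f(x)}{n!\, n^x} \le \left(1 + \frac{1}{n}\right)^x,$$

which yields

$$f(x) = \lim_{n \to \infty} \frac{n!\, n^x}{(n + x)(n - 1 + x) \cdots x} \quad \text{for } x \in (0, 1].$$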

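We shall show that the above formula is valid for all x > 0. For this, consider
the integer number m such that $0 < x - m \le 1$. According to (a) and what we just
proved, we get

$$f(x) = (x - 1) \cdots (x - m) f(x - m)$$

$$= (x - 1) \cdots (x - m) \cdot \lim_{n \to \infty} \frac{n!\, n^{x-m}}{(n + x - m)(n - 1 + x - m) \cdots (x - m)}$$

$$= \lim_{n \to \infty} \left(\frac{n!\, n^x}{(n + x)(n - 1 + x) \cdots x} \cdot \frac{(n + x)(n + x - 1) \cdots (n + x - (m - 1))}{n^m}\right)$$

$$= \lim_{n \to \infty} \frac{n!\, n^x}{(n + x)(n - 1 + x) \cdots x} \cdot \lim_{n \to \infty} \left(\left(1 + \frac{x}{n}\right) \left(1 + \frac{x - 1}{n}\right) \cdots \left(1 + \frac{x - m + 1}{n}\right)\right)$$

$$= \lim_{n \to \infty} \frac{n!\, n^x}{(n + x)(n - 1 + x) \cdots x}.$$

Therefore f is uniquely determined by the conditions (a), (b) and (c). Since $\Gamma$
satisfies all these three conditions, we must have $f = \Gamma$.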


10.3.6 Corollary (C.F. Gauss) $\Gamma(x) = \lim_{n \to \infty} \frac{n!\, n^x}{x(x + 1) \cdots (x + n)}$ for every
x > 0.
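
Gauss' limit formula can be tested against a library implementation of the gamma function. The sketch below is illustrative only (`gauss_gamma` is a hypothetical name); the quotient is evaluated through logarithms, since n! and $n^x$ overflow quickly, and the result is compared with `math.gamma` from the Python standard library.

```python
import math

def gauss_gamma(x: float, n: int) -> float:
    """n! n^x / (x (x+1) ... (x+n)) for x > 0, computed via logarithms."""
    log_factorial = math.fsum(math.log(k) for k in range(1, n + 1))
    log_denominator = math.fsum(math.log(x + k) for k in range(n + 1))
    return math.exp(log_factorial + x * math.log(n) - log_denominator)

for n in (10, 1000, 100000):
    print(n, gauss_gamma(2.5, n), math.gamma(2.5))
```

The convergence is slow: the relative error behaves roughly like x(x + 1)/(2n), so the formula is better suited to proofs than to numerical evaluation.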


10.3.7 Remark From the recurrence formula $\Gamma(x + 1) = x\,\Gamma(x)$, we infer by induction
that

$$\Gamma(x + n) = x(x + 1) \cdots (x + n - 1)\,\Gamma(x) \quad \text{for all } x > 0 \text{ and } n \in \mathbb{N},\ n \ge 1.$$

This allows us to extend the gamma function to $(-\infty, 0) \setminus \{-1, -2, \ldots\}$ via the formula

$$\Gamma(x) = \frac{\Gamma(x + n)}{x(x + 1) \cdots (x + n - 1)},$$

where n is a positive integer greater than -x. See Fig. 10.1.


The following result provides a fundamental identity linking the gamma and the sine 
function: 

10.3.8 Theorem (Euler's Reflection Formula) For every x in (0, 1),

$$\Gamma(x)\,\Gamma(1 - x) = \frac{\pi}{\sin \pi x}.$$

Proof In fact, by Corollary 10.3.6 and Euler's infinite product for the sine function
(Theorem 8.7.1 above), we infer that

$$\Gamma(x)\,\Gamma(1 - x) = \lim_{n \to \infty} \frac{n!\, n^x\, n!\, n^{1-x}}{(n + x)(n - 1 + x) \cdots x\,(n + 1 - x)(n - x) \cdots (1 - x)}$$

$$= \frac{1}{x \prod_{k=1}^{\infty} \left(1 - \frac{x^2}{k^2}\right)} = \frac{\pi}{\sin \pi x}.$$


10.3.9 Corollary $\Gamma(1/2) = \sqrt{\pi}$.

A variant of the last corollary is the formula

$$\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-x^2/2}\,dx = 1, \qquad (10.3)$$

which appears in many places in mathematics. Another interesting consequence of
Theorem 10.3.5 is as follows:


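10.3.10 Theorem (The Gauss-Legendre Duplication Formula)

$$\Gamma\left(\frac{x}{2}\right) \Gamma\left(\frac{x + 1}{2}\right) = \frac{\sqrt{\pi}}{2^{x-1}}\,\Gamma(x) \quad \text{for every } x > 0.$$

Proof Notice that the function

$$f(x) = \frac{2^{x-1}}{\sqrt{\pi}}\,\Gamma\left(\frac{x}{2}\right) \Gamma\left(\frac{x + 1}{2}\right), \quad x > 0,$$

verifies the conditions (a)-(c) in Theorem 10.3.5 and thus equals $\Gamma$.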
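
Both the reflection formula and the duplication formula are easy to sanity-check in floating point arithmetic. The following lines are a small illustration using `math.gamma` from the Python standard library (the function names `reflection_gap` and `duplication_gap` are ours); each function returns the difference between the two sides of the corresponding identity.

```python
import math

def reflection_gap(x: float) -> float:
    """Gamma(x) Gamma(1-x) - pi / sin(pi x), for 0 < x < 1."""
    return math.gamma(x) * math.gamma(1.0 - x) - math.pi / math.sin(math.pi * x)

def duplication_gap(x: float) -> float:
    """Gamma(x/2) Gamma((x+1)/2) - sqrt(pi) / 2^(x-1) * Gamma(x), for x > 0."""
    left = math.gamma(x / 2.0) * math.gamma((x + 1.0) / 2.0)
    right = math.sqrt(math.pi) / 2.0 ** (x - 1.0) * math.gamma(x)
    return left - right

print(reflection_gap(0.3), duplication_gap(3.7))   # both are at rounding-error level
print(math.gamma(0.5) ** 2, math.pi)               # Corollary 10.3.9
```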


Stirling's formula,

$$n! \sim \sqrt{2\pi}\; n^{n+1/2} e^{-n},$$

admits an extension for the Gamma function.

10.3.12 Theorem (P.S. Laplace) $\Gamma(x + 1) \sim \sqrt{2\pi}\; x^{x+1/2} e^{-x}$ as $x \to \infty$.


Proof Our argument is based on the property of log-convexity of the Gamma function:

$$\Gamma((1 - \lambda)x + \lambda y) \le \Gamma(x)^{1-\lambda}\, \Gamma(y)^{\lambda} \quad \text{for all } x, y > 0 \text{ and } \lambda \in [0, 1].$$
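
This is used to estimate the function

$$f(x) = \frac{\Gamma(x + 1)\, e^x}{x^{x+1/2}}$$

for large values of x.
Indeed, if x is a positive number, then $\lfloor x \rfloor + 1 \le x + 1 < \lfloor x \rfloor + 2$, which allows
us to represent x + 1 as a convex combination,

$$x + 1 = (1 - \{x\})(\lfloor x \rfloor + 1) + \{x\}(\lfloor x \rfloor + 2).$$

As a consequence,

$$\Gamma(x + 1) \le \Gamma(\lfloor x \rfloor + 1)^{1 - \{x\}}\, \Gamma(\lfloor x \rfloor + 2)^{\{x\}}$$
$$= (\lfloor x \rfloor!)^{1 - \{x\}}\, ((\lfloor x \rfloor + 1)!)^{\{x\}}$$
$$= \lfloor x \rfloor!\, (\lfloor x \rfloor + 1)^{\{x\}},$$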

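which yields

$$f(x) \le \frac{\lfloor x \rfloor!\, (\lfloor x \rfloor + 1)^{\{x\}}\, e^{\lfloor x \rfloor + \{x\}}}{x^{x + 1/2}} = f(\lfloor x \rfloor)\, e^{\{x\}} \left(1 + \frac{1}{\lfloor x \rfloor}\right)^{\{x\}} \left(\frac{\lfloor x \rfloor}{x}\right)^{x + 1/2} \qquad (10.4)$$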


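for every x > 1. In the same way, taking into account that

$$\lfloor x \rfloor + 1 = \{x\}\, x + (1 - \{x\})(x + 1),$$

we obtain

$$\lfloor x \rfloor! = \Gamma(\lfloor x \rfloor + 1) \le \Gamma(x)^{\{x\}}\, \Gamma(x + 1)^{1 - \{x\}} = x^{-\{x\}}\, \Gamma(x + 1).$$

That is, $\lfloor x \rfloor!\, x^{x - \lfloor x \rfloor} \le \Gamma(x + 1)$, whence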


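$$f(x) = \frac{\Gamma(x + 1)\, e^x}{x^{x + 1/2}} \ge \frac{\lfloor x \rfloor!\, x^{\{x\}}\, e^x}{x^{x + 1/2}} = f(\lfloor x \rfloor)\, e^{\{x\}} \left(\frac{\lfloor x \rfloor}{x}\right)^{\lfloor x \rfloor + 1/2} \qquad (10.5)$$

for every x > 1.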
By Stirling's formula, $\lim_{x \to \infty} f(\lfloor x \rfloor) = \sqrt{2\pi}$, so by (10.4) and (10.5) we conclude
that

$$\lim_{x \to \infty} \frac{\Gamma(x + 1)\, e^x}{x^{x + 1/2}} = \sqrt{2\pi}.$$
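
The statement of Theorem 10.3.12 can be observed numerically for non-integer arguments as well. The snippet below is an illustration only (`stirling_ratio` is a hypothetical name); it uses `math.lgamma` from the Python standard library, so that the quotient $\Gamma(x+1)e^x / x^{x+1/2}$ is formed in logarithmic scale and no overflow occurs.

```python
import math

def stirling_ratio(x: float) -> float:
    """Gamma(x+1) e^x / x^(x+1/2), computed from logarithms."""
    return math.exp(math.lgamma(x + 1.0) + x - (x + 0.5) * math.log(x))

for x in (1.5, 10.0, 1234.5, 1e4):
    print(x, stirling_ratio(x), math.sqrt(2.0 * math.pi))
```

The ratios decrease towards $\sqrt{2\pi} \approx 2.5066$, the relative excess being close to 1/(12x).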


Closely related to the gamma function is the beta function B, which is the real
function of two variables defined by the formula

$$B(x, y) = \int_0^1 t^{x-1} (1 - t)^{y-1}\,dt \quad \text{for } x, y > 0.$$

10.3.13 Theorem The beta function has the following properties:

(a) $B(x, y) = B(y, x)$ and $B(x + 1, y) = \frac{x}{x + y}\, B(x, y)$;

(b) B(x, y) is a log-convex function of x for each fixed y;

(c) $B(x, y) = \frac{\Gamma(x)\,\Gamma(y)}{\Gamma(x + y)}$.


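Proof (a) The first formula follows by the change of variable u = 1 - t. For the
second one,

$$B(x + 1, y) = \int_0^1 t^x (1 - t)^{y-1}\,dt = \int_0^1 (1 - t)^{x+y-1} \left(\frac{t}{1 - t}\right)^x dt$$

$$= \left[-\frac{(1 - t)^{x+y}}{x + y} \left(\frac{t}{1 - t}\right)^x\right]_0^1 + \frac{x}{x + y} \int_0^1 (1 - t)^{x+y} \left(\frac{t}{1 - t}\right)^{x-1} \frac{dt}{(1 - t)^2}$$

$$= \frac{x}{x + y}\, B(x, y).$$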

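(b) Let a, b, y > 0 and let $\lambda, \mu > 0$ with $\lambda + \mu = 1$. By the Rogers-Hölder
Inequality,

$$B(\lambda a + \mu b, y) = \int_0^1 \left(t^{a-1} (1 - t)^{y-1}\right)^{\lambda} \left(t^{b-1} (1 - t)^{y-1}\right)^{\mu} dt$$

$$\le \left(\int_0^1 t^{a-1} (1 - t)^{y-1}\,dt\right)^{\lambda} \left(\int_0^1 t^{b-1} (1 - t)^{y-1}\,dt\right)^{\mu}$$

$$= B^{\lambda}(a, y) \cdot B^{\mu}(b, y).$$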
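(c) Let y > 0 be arbitrarily fixed and consider the function

$$\varphi_y(x) = \frac{\Gamma(x + y)\, B(x, y)}{\Gamma(y)}, \quad x > 0.$$

Then $\varphi_y$ is a product of log-convex functions and so is itself log-convex. Also,

$$\varphi_y(x + 1) = \frac{\Gamma(x + y + 1)\, B(x + 1, y)}{\Gamma(y)} = \frac{(x + y)\,\Gamma(x + y)\, [x/(x + y)]\, B(x, y)}{\Gamma(y)} = x\,\varphi_y(x)$$

for every x > 0 and

$$\varphi_y(1) = \frac{\Gamma(1 + y)\, B(1, y)}{\Gamma(y)} = y \int_0^1 (1 - t)^{y-1}\,dt = 1.$$

Thus, $\varphi_y = \Gamma$ by Theorem 10.3.5 and so

$$B(x, y) = \frac{\Gamma(x)\,\Gamma(y)}{\Gamma(x + y)}.$$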
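
The identity in Theorem 10.3.13 (c) lends itself to a quick numerical check. The sketch below is illustrative and makes its own choices: the defining integral is approximated by a plain midpoint rule (`beta_midpoint`, a hypothetical helper that is only reasonable for x, y >= 1, where the integrand is bounded) and compared with the quotient of gamma values (`beta_gamma`).

```python
import math

def beta_midpoint(x: float, y: float, n: int = 20000) -> float:
    """Midpoint-rule approximation of the integral of t^(x-1) (1-t)^(y-1) over [0, 1]."""
    h = 1.0 / n
    return h * math.fsum(((k + 0.5) * h) ** (x - 1.0) * (1.0 - (k + 0.5) * h) ** (y - 1.0)
                         for k in range(n))

def beta_gamma(x: float, y: float) -> float:
    """Gamma(x) Gamma(y) / Gamma(x + y)."""
    return math.gamma(x) * math.gamma(y) / math.gamma(x + y)

print(beta_midpoint(2.5, 3.5), beta_gamma(2.5, 3.5))
```

For x or y below 1 the integrand is unbounded near an endpoint and a rule adapted to the singularity would be needed; this case is not covered by the sketch.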
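
Exercises

1. Show that

$$\Gamma\left(n + \frac{1}{2}\right) = \frac{(2n)!\,\sqrt{\pi}}{n!\; 4^n} \quad \text{for every } n \in \mathbb{N}.$$

2. Infer from Theorem 10.3.5, Weierstrass' formula for the gamma function:

$$\Gamma(x) = \frac{e^{-\gamma x}}{x} \prod_{n=1}^{\infty} \left(1 + \frac{x}{n}\right)^{-1} e^{x/n},$$

where $\gamma$ is Euler's constant.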

3. Infer from the preceding exercise and Theorem 8.6.2 (on term by term
differentiability of series of functions) that the gamma function is differentiable
on every interval $[\varepsilon, \infty)$ with $\varepsilon > 0$ (and thus, on the whole interval $(0, \infty)$).

4. Prove the formula

$$\frac{d^2}{dx^2} \log \Gamma(x) = \sum_{n=0}^{\infty} \frac{1}{(x + n)^2}, \quad \text{for } x > 0.$$


5. Use the log-convexity of I to show that 


ee ee a ae 
nal = 5 / 5 <5 or every 7 : 


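6. Show that for all x > 0,

$$\Gamma\left(\frac{x}{3}\right) \Gamma\left(\frac{x + 1}{3}\right) \Gamma\left(\frac{x + 2}{3}\right) = \frac{2\pi\sqrt{3}}{3^x}\,\Gamma(x).$$

7. (L. Euler). Establish the following result:

$$\int_0^\infty \frac{t^{x-1}}{1 + t}\,dt = \frac{\pi}{\sin \pi x} \quad \text{for } 0 < x < 1.$$

8. Prove that

$$B(x, y) = 2 \int_0^{\pi/2} \sin^{2x-1} t \cdot \cos^{2y-1} t\,dt \quad \text{for } x, y > 0,$$

and deduce the formula

$$\int_0^{\pi/2} \sin^{2m} t\,dt = \frac{(2m)!\,\pi}{2^{2m+1} (m!)^2}.$$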
0 


10.4 Notes and Remarks 


The remainder in the Euler—Maclaurin Formula was found by Siméon Denis Poisson 
in 1823. Our approach follows an idea due to Apostol [3]. For applications of this 
formula, see the papers of Boas Jr. [4] and Lampret [5], and also the book [6]. 

The gamma function was introduced by Leonhard Euler in 1729. However, the 
name gamma function and the symbol $\Gamma$ were introduced by Adrien-Marie Legendre
around 1811. The Bohr-Mollerup Theorem was proved in 1922 by Harald Bohr and
Johannes Mollerup and is remarkable for highlighting the role of logarithmic convexity
in the characterization of the gamma function. See [7].

The proof of Stirling’s Formula for the gamma function (Theorem 10.3.12) is 
borrowed from the paper of Dutkay et al. [8]. 

Stirling's formula is not the only asymptotic formula for n!. Less known,
but actually more accurate, is the asymptotic formula of William Burnside,

$$n! \sim \sqrt{2\pi} \left(\frac{n + 1/2}{e}\right)^{n + 1/2}.$$

An extension of this formula to positive real values was obtained by J.R. Wilton in
1922:

$$\Gamma(x + 1) \sim \sqrt{2\pi} \left(\frac{x + 1/2}{e}\right)^{x + 1/2} \quad \text{as } x \to \infty.$$

It is worth mentioning that the argument used above to prove Theorem 10.3.12 can
be adapted with minor changes to derive Wilton's formula.
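
The claim about the accuracy of Burnside's formula is easy to test for small n. The following lines are a simple illustration (the function names `stirling` and `burnside` are ours); they compare the relative errors of the two approximations with the exact factorial from the Python standard library.

```python
import math

def stirling(n: int) -> float:
    """Stirling's approximation sqrt(2 pi n) (n/e)^n."""
    return math.sqrt(2.0 * math.pi * n) * (n / math.e) ** n

def burnside(n: int) -> float:
    """Burnside's approximation sqrt(2 pi) ((n + 1/2)/e)^(n + 1/2)."""
    return math.sqrt(2.0 * math.pi) * ((n + 0.5) / math.e) ** (n + 0.5)

for n in (1, 5, 10, 50):
    exact = math.factorial(n)
    print(n, abs(stirling(n) / exact - 1.0), abs(burnside(n) / exact - 1.0))
```

In these experiments the relative error of Burnside's formula is about half that of Stirling's (roughly 1/(24n) against 1/(12n)), with the opposite sign.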

A full extension of Riemann integral, of Lebesgue integral, and also of Riemann 
improper integrals, is provided by the theory of generalized Riemann integral. See 
the Notes and Remarks at the end of Chap. 9. Unfortunately, the extension to several 
variables of the generalized Riemann integral is quite intricate and this is a serious 
drawback. 

The Lebesgue integral (which has the feature that every integrable function is 
also absolutely integrable) is only a partial extension of Riemann improper integrals. 
Though slow in teaching, it covers perfectly all important applications of Analysis 
of one or several real variables. A quick survey of Lebesgue integral is presented in 
Chap. 11. 

Chapter 11
The Theory of Lebesgue Integral


In 1901, H. Lebesgue published the paper announcing the discovery of the 
integral that now bears his name. Lebesgue’s integral extends Riemann’s integral and 
also the absolutely convergent Riemann improper integrals. Among its nice features, 
we should mention the completeness of the space of Lebesgue integrable functions 
and the existence of some rather general conditions under which the integral inter- 
changes with the derivative. In Appendix A, we will discuss the extension of the 
Fundamental Theorem of Integral Calculus (that is, the Leibniz-Newton Formula) 
to the framework of Lebesgue integral. 


11.1 The Measure of Elementary Sets 


Any presentation of the Lebesgue integral is based on some elements of measure 
theory. 

Given a bounded interval A of endpoints $a \le b$, its length $\ell(A) = b - a$ will
also be called the Lebesgue measure of A and denoted $\lambda(A)$.

The intersection of two intervals is again an interval, but the difference and union
of intervals are not necessarily intervals. This fact leads us to consider the so called
elementary sets, which are finite unions of bounded intervals.

All elementary sets are bounded and thus, relatively compact. 

The set $\mathcal{E}(\mathbb{R})$ of all elementary subsets of $\mathbb{R}$ is closed under union, intersection,
and difference of sets:

$$A, B \in \mathcal{E}(\mathbb{R}) \text{ implies } A \cup B,\ A \cap B,\ A \setminus B \in \mathcal{E}(\mathbb{R}).$$

An important fact is the possibility to represent each elementary set E as a finite
union of pairwise disjoint intervals. For example, when the elementary set is the
union of two intervals A and B, such a representation is provided by the formula:

$$A \cup B = (A \setminus B) \cup (A \cap B) \cup (B \setminus A).$$

The general case can be settled via Mathematical Induction.
By definition, the Lebesgue measure of an elementary set A is

$$\lambda(A) = \sum_{k=1}^{n} \lambda(A_k),$$

where $A = \bigcup_{k=1}^{n} A_k$ is a representation of A as a finite union of pairwise disjoint
intervals.
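The correctness of this definition, that is, the independence of $\lambda(A)$ from the
particular representation of A as a finite union of pairwise disjoint intervals, can be
proved as follows. Suppose that

$$A = \bigcup_{j=1}^{m} B_j \quad \text{and} \quad A = \bigcup_{k=1}^{n} C_k$$

are two such representations. If A itself is an interval, then clearly

$$\sum_{j=1}^{m} \lambda(B_j) = \sum_{k=1}^{n} \lambda(C_k) = \lambda(A).$$

If A is not an interval, the aforementioned remark shows that

$$\sum_{j=1}^{m} \lambda(B_j) = \sum_{j=1}^{m} \lambda\Big(\bigcup_{k=1}^{n} (B_j \cap C_k)\Big) = \sum_{j=1}^{m} \sum_{k=1}^{n} \lambda(B_j \cap C_k) = \sum_{k=1}^{n} \sum_{j=1}^{m} \lambda(B_j \cap C_k)$$

$$= \sum_{k=1}^{n} \lambda\Big(\bigcup_{j=1}^{m} (C_k \cap B_j)\Big) = \sum_{k=1}^{n} \lambda(C_k).$$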

The Lebesgue measure of the elementary sets appears as a function

$$\lambda : \mathcal{E}(\mathbb{R}) \to [0, \infty),$$

that extends the length of intervals and verifies the following two properties:

(Meas1) $\lambda(\emptyset) = 0$; and
(Meas2) $\lambda$ is finitely additive in the sense that

$$\lambda\Big(\bigcup_{k=1}^{n} A_k\Big) = \sum_{k=1}^{n} \lambda(A_k)$$

for every finite family $(A_k)_{k=1}^{n}$ of pairwise disjoint sets in $\mathcal{E}(\mathbb{R})$.
From these two properties and the fact that $\lambda \ge 0$, we easily derive some other
useful properties:

• monotonicity,
$A, B \in \mathcal{E}(\mathbb{R})$ and $A \subset B$ imply $\lambda(A) \le \lambda(B)$;
• subtractivity,
$A, B \in \mathcal{E}(\mathbb{R})$ and $A \subset B$ imply $\lambda(B \setminus A) = \lambda(B) - \lambda(A)$;
• subadditivity,

$$\lambda\Big(\bigcup_{k=1}^{n} A_k\Big) \le \sum_{k=1}^{n} \lambda(A_k) \quad \text{for every finite family of sets of } \mathcal{E}(\mathbb{R}).$$

The Lebesgue measure of elementary sets also verifies the property of regularity,

$$\lambda(A) = \sup \{\lambda(K) : K \subset A,\ K \in \mathcal{E}(\mathbb{R}) \text{ compact}\}$$
$$= \inf \{\lambda(D) : A \subset D,\ D \in \mathcal{E}(\mathbb{R}) \text{ open}\},$$

which relates Lebesgue measure to topology. The proof is immediate by noticing
that every bounded interval of endpoints $a \le b$ is contained in an open interval
$(a - \varepsilon, b + \varepsilon)$ of arbitrarily close length and contains a compact interval $[a + \varepsilon, b - \varepsilon]$
of arbitrarily close length.
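
The passage from an arbitrary finite union of intervals to a disjoint representation, and the resulting measure, can be mimicked in a few lines. The sketch below is an illustration under simplifying assumptions of our own: an interval is stored as a pair (a, b) with a <= b, and no record is kept of whether the endpoints belong to it, which is harmless here because single points do not change the measure. The name `measure` is hypothetical.

```python
def measure(intervals):
    """Lebesgue measure of the union of finitely many bounded intervals (a, b), a <= b."""
    total = 0.0
    current_end = None
    for a, b in sorted(intervals):
        if current_end is None or a > current_end:
            total += b - a               # a new piece, disjoint from the previous ones
            current_end = b
        elif b > current_end:
            total += b - current_end     # only the part that sticks out is new
            current_end = b
    return total

A, B = (0.0, 2.0), (1.0, 3.0)
print(measure([A, B]))                   # measure of the union
print(measure([A]) + measure([B]))       # sum of the measures (subadditivity)
```

Sorting by the left endpoint and keeping track of the rightmost point reached so far amounts to building the disjoint representation on the fly.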

We end this section with a glimpse of the general concept of measure. We start
by saying that a measure on a nonempty set $\Omega$ is a positive function defined on some
special classes of subsets of $\Omega$.


11.1.1 Definition A ring of subsets of $\Omega$ is any nonempty family $\mathcal{R}$ of subsets of
$\Omega$, closed under the operations of union and difference. Necessarily, a ring of sets is
also closed under intersection because of the identity

$$A \cap B = A \setminus (A \setminus B).$$

If moreover $\Omega \in \mathcal{R}$, then $\mathcal{R}$ is called an algebra of sets.


If $\mathcal{A}$ is an algebra, then it contains together with a set A also its complementary
set $\complement A$.
$\mathcal{E}(\mathbb{R})$ is a ring of sets but not an algebra.

The terminology of ring of subsets is motivated by the remark that any such family
$\mathcal{C}$ is a commutative ring with respect to the algebraic operations of "addition" and
"multiplication" defined respectively by symmetric difference and intersection of
sets. The symmetric difference of two sets is defined by

$$A \,\Delta\, B = (A \setminus B) \cup (B \setminus A).$$


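11.1.2 Definition A nonempty family $\mathcal{A}$ of subsets of $\Omega$ is said to be a $\sigma$-algebra
of subsets if it verifies the following three properties:

($\Sigma$A1) $\emptyset$ and $\Omega$ belong to $\mathcal{A}$.
($\Sigma$A2) If $A \in \mathcal{A}$, then $\complement A \in \mathcal{A}$.
($\Sigma$A3) If $(A_n)_n$ is a sequence of elements of $\mathcal{A}$, then $\bigcup_{n=0}^{\infty} A_n$ and $\bigcap_{n=0}^{\infty} A_n$ also belong
to $\mathcal{A}$.

$\{\emptyset, \Omega\}$ and $\mathcal{P}(\Omega)$ are examples of $\sigma$-algebras of subsets of $\Omega$.

Given a nonempty family $\mathcal{F}$ of subsets of $\Omega$, the intersection $\Sigma(\mathcal{F})$ of all
$\sigma$-algebras of subsets of $\Omega$ that include $\mathcal{F}$ is still a $\sigma$-algebra, precisely, the smallest
$\sigma$-algebra which includes $\mathcal{F}$. We call $\Sigma(\mathcal{F})$ the $\sigma$-algebra generated by $\mathcal{F}$. The
$\sigma$-algebra generated by the open subsets of $\mathbb{R}$ is called the Borel $\sigma$-algebra (or the
$\sigma$-algebra of Borel sets) and is denoted $\mathcal{B}(\mathbb{R})$. It contains also the closed subsets,
the $G_\delta$ subsets, the $F_\sigma$ subsets, and the differences of these subsets. The $\sigma$-algebra
$\mathcal{B}(\mathbb{R})$ coincides with the $\sigma$-algebra generated by the elementary sets, that is,

$$\mathcal{B}(\mathbb{R}) = \Sigma(\mathcal{E}(\mathbb{R})).$$
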
The theory of Lebesgue integral outlines a bigger o-algebra, 92(IR), which con- 
tains also all Lebesgue null sets. The details will be presented in Sect. 11.5. 


11.1.3 Definition Given a ring of sets $\mathcal{C}$, a finite additive measure on $\mathcal{C}$ is any function
$\mu : \mathcal{C} \to [0, \infty]$ that verifies the following two properties:

(Meas0) $\mu(\emptyset) = 0$; and
(MeasFA) for every finite family $(A_k)_{k=1}^{n}$ of pairwise disjoint sets in $\mathcal{C}$,

$$\mu\Big(\bigcup_{k=1}^{n} A_k\Big) = \sum_{k=1}^{n} \mu(A_k).$$

We call a measure $\sigma$-additive if the condition of finite additivity is strengthened to

(Meas$\sigma$A) whenever $(A_k)_k$ is a countable family of pairwise disjoint sets in $\mathcal{C}$
whose union belongs to $\mathcal{C}$,

$$\mu\Big(\bigcup_{k=1}^{\infty} A_k\Big) = \sum_{k=1}^{\infty} \mu(A_k).$$
Some simple examples of $\sigma$-additive measures defined on $\sigma$-algebras are as
follows:

• The Dirac measure concentrated at a point $a \in \Omega$:

$$\delta_a : \mathcal{P}(\Omega) \to [0, 1], \quad \delta_a(A) = \begin{cases} 1 & \text{if } a \in A \\ 0 & \text{if } a \notin A. \end{cases}$$

• The counting measure,

$$\# : \mathcal{P}(\Omega) \to [0, \infty], \quad \#(A) = \begin{cases} n & \text{if } A \text{ is finite and has } n \text{ elements} \\ \infty & \text{if } A \text{ is infinite.} \end{cases}$$

• The discrete probability measure associated to a countable family $\Omega$ of possible
outcomes, denoted $1, 2, 3, \ldots$. To each $n \in \Omega$ there is attached a probability value
$p_n \in [0, 1]$ such that $\sum_n p_n = 1$. An event is defined as any subset of the sample
space $\Omega$. The probability of the event $E$ is defined as $P(E) = \sum_{n \in E} p_n$. This
procedure gives rise to a $\sigma$-additive measure

$$P : \mathcal{P}(\Omega) \to [0, 1].$$

In the classical cases where $\Omega$ is finite (for example, in the case of coin tossing,
rolling a die, drawing balls out of an urn, winning the lottery, etc.), $P(E)$ is defined as

$$P(E) = \frac{\text{Number of favorable outcomes}}{\text{Total number of outcomes}}.$$

In the fair case, the probabilities of outcomes are the same and $P$ differs from the
counting measure by a multiplicative factor.
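The three examples can be made concrete on a finite sample space. The sketch below assumes that sets are Python frozensets and that probabilities are exact fractions; the names `dirac`, `counting` and `discrete_probability` are hypothetical helpers chosen for this illustration. It models a fair die and shows that the probability measure is the counting measure divided by the number of outcomes.

```python
from fractions import Fraction

def dirac(a):
    """Dirac measure concentrated at the point a."""
    return lambda A: 1 if a in A else 0

def counting(A):
    """Counting measure of a finite set A."""
    return len(A)

def discrete_probability(weights):
    """Measure P(E) = sum of p_n over n in E, for a dict {outcome: p_n}."""
    assert sum(weights.values()) == 1, "probabilities must sum to 1"
    return lambda E: sum(weights[n] for n in E)

# A fair die: every outcome has probability 1/6.
omega = frozenset(range(1, 7))
P = discrete_probability({n: Fraction(1, 6) for n in omega})

even = frozenset({2, 4, 6})
one = frozenset({1})

print(P(even))                                       # probability of an even outcome
print(P(even | one) == P(even) + P(one))             # additivity on disjoint events
print(P(even) == Fraction(counting(even), counting(omega)))
print(dirac(2)(even), dirac(3)(even))
```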


Any triplet $(\Omega, \Sigma, \mu)$, consisting of a nonempty set $\Omega$, a $\sigma$-algebra $\Sigma$ of subsets
of $\Omega$ and a $\sigma$-additive measure $\mu$ defined on $\Sigma$, is usually called a measure space.
When $\mu$ is a probability measure (that is, when $\mu(\Omega) = 1$), the measure space
$(\Omega, \Sigma, \mu)$ is also called a probability space.

One can prove that the Lebesgue measure of elementary sets is also $\sigma$-additive.
In Sect. 11.6 it will be shown that this measure has a unique $\sigma$-additive extension to
the $\sigma$-algebra $\mathfrak{M}(\mathbb{R})$, which verifies the property of regularity.


Exercises 


1. Prove the properties of monotonicity, subtractivity and subadditivity of the 
Lebesgue measure of elementary sets. Notice that these properties are verified 
by any finite additive measure. 

2. (Poincaré’s Identity). Suppose that $\mu : \mathcal{C} \to [0, \infty)$ is a finite additive measure.
Prove that for every finite family $(A_k)_{k=1}^{n}$ of sets in $\mathcal{C}$, we have

$$\mu\Big(\bigcup_{k=1}^{n} A_k\Big) = \sum_{k=1}^{n} \mu(A_k) - \sum_{1 \le k < l \le n} \mu(A_k \cap A_l) + \cdots + (-1)^{n+1} \mu\Big(\bigcap_{k=1}^{n} A_k\Big).$$

In the case of a counting measure on a finite set, this identity is known as the
Inclusion-Exclusion Principle.
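For the counting measure, the alternating sum over all intersections can be evaluated directly and compared with the size of the union. This is a minimal numerical check, not a proof; the function name `union_size_by_poincare` and the sample sets are assumptions made for the illustration.

```python
from itertools import combinations

def union_size_by_poincare(sets):
    """Alternating sum over all nonempty subfamilies, for the counting measure."""
    total = 0
    for r in range(1, len(sets) + 1):
        sign = (-1) ** (r + 1)
        for group in combinations(sets, r):
            total += sign * len(frozenset.intersection(*group))
    return total

A = [frozenset({1, 2, 3, 4}), frozenset({3, 4, 5}), frozenset({1, 5, 6})]
lhs = len(frozenset.union(*A))   # counting measure of the union
rhs = union_size_by_poincare(A)  # inclusion-exclusion sum
print(lhs, rhs)
```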


11.2 The Integral of Step Functions 


In this section, we will associate to the Lebesgue measure of elementary sets an 
integral. This integral is similar to Riemann’s integral of step functions, the difference 
being that we work here with functions defined on the whole real line. 

In what follows, by a step function we mean any linear combination with complex 
coefficients of characteristic functions of elementary sets. Necessarily, a step function 
is bounded and has only finitely many discontinuities (all of first kind). 

The set St(R) of all step functions is a commutative algebra with respect to the 
pointwise operations. 


11.2.1 Lemma Every function $f \in St(\mathbb{R})$ admits a representation of the form
$f = \sum_{k=1}^{n} c_k \chi_{A_k}$, where the sets $A_k \in \mathcal{E}(\mathbb{R})$ are pairwise disjoint.
Such representations will be referred to as canonical representations.


Proof Notice first that

$$c_1 \chi_{A_1} + c_2 \chi_{A_2} = c_1 \chi_{A_1 \setminus A_2} + (c_1 + c_2) \chi_{A_1 \cap A_2} + c_2 \chi_{A_2 \setminus A_1}.$$

Then continue by Mathematical Induction.


11.2.2 Corollary If $f \in St(\mathbb{R})$, then $|f| \in St(\mathbb{R})$.


Proof Indeed, if $f = \sum_{k=1}^{n} c_k \chi_{A_k}$ is a canonical representation of $f$, then $|f| = \sum_{k=1}^{n} |c_k| \chi_{A_k}$.


The set $St_{\mathbb{R}}(\mathbb{R})$, of all real-valued step functions on $\mathbb{R}$, is also an algebra. With
respect to the pointwise ordering,

$$f \le g \iff f(x) \le g(x) \text{ for every } x,$$

$St_{\mathbb{R}}(\mathbb{R})$ is a linear lattice of functions.
11.2.3 Definition Let $f \in St(\mathbb{R})$ be a step function with a canonical representation
$f = \sum_{k=1}^{n} c_k \chi_{A_k}$. We call the integral of $f$ the number

$$\int_{\mathbb{R}} f \, dx = \sum_{k=1}^{n} c_k \lambda(A_k).$$

The correctness of this definition (that is, the independence from particular canonical
representations of $f$) can be established as in the case of the measure of elementary
sets.
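The passage from an arbitrary representation to a canonical one, and the resulting integral, can be sketched in a few lines. The sketch assumes, for simplicity, that every term is a multiple of the characteristic function of a half-open interval $[a, b)$ and is stored as a triple `(c, a, b)`; this restriction and the function names `canonical` and `integral` are choices made for this illustration only.

```python
from fractions import Fraction

def canonical(terms):
    """Rewrite a sum of c * chi_[a, b) as a sum over pairwise disjoint intervals."""
    points = sorted({p for _, a, b in terms for p in (a, b)})
    pieces = []
    for left, right in zip(points, points[1:]):
        # On [left, right) the function is constant: add the coefficients
        # of all terms whose interval contains this piece.
        value = sum(c for c, a, b in terms if a <= left and right <= b)
        if value != 0:
            pieces.append((value, left, right))
    return pieces

def integral(terms):
    """Sum of c_k times the length of A_k over a canonical representation."""
    return sum(c * (b - a) for c, a, b in canonical(terms))

# f = 2 * chi_[0, 3) - chi_[1, 4); the two intervals overlap on [1, 3).
f = [(Fraction(2), 0, 3), (Fraction(-1), 1, 4)]
print(canonical(f))
print(integral(f))
```

The three pieces produced for this `f` correspond to the sets $A_1 \setminus A_2$, $A_1 \cap A_2$ and $A_2 \setminus A_1$ from the proof of Lemma 11.2.1.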

The integral defined above gives rise to a functional,

$$I : St(\mathbb{R}) \to \mathbb{C}, \quad I(f) = \int_{\mathbb{R}} f \, dx,$$

that provides a procedure to bring together the Riemann integration functionals
(relative to the different compact intervals of $\mathbb{R}$). Indeed, if $f \in St(\mathbb{R})$, then
$f \chi_{[a,b]} \in St([a, b])$ for every compact interval $[a, b]$ and

$$\int_a^b f \, dx = \int_{\mathbb{R}} f \chi_{[a,b]} \, dx.$$

Conversely, if $g \in St([a, b])$ and $g \chi_{[a,b]}$ is its extension to $\mathbb{R}$ which vanishes outside
$[a, b]$, then $g \chi_{[a,b]} \in St(\mathbb{R})$ and

$$\int_a^b g \, dx = \int_{\mathbb{R}} g \chi_{[a,b]} \, dx.$$


The main properties of the functional $I$ are as follows:

(Linearity) For all $\alpha, \beta \in \mathbb{C}$ and all $f, g \in St(\mathbb{R})$,

$$I(\alpha f + \beta g) = \alpha I(f) + \beta I(g);$$

(Positivity) If $f \ge 0$, then $I(f) \ge 0$.

(Calibration) If $A$ is a bounded interval, then

$$I(\chi_A) = \ell(A).$$

The properties of linearity and positivity imply the property of monotonicity,

$$f \le g \text{ in } St_{\mathbb{R}}(\mathbb{R}) \text{ implies } I(f) \le I(g),$$

and this one, combined with the Absolute Value Inequality, yields the following
Absolute Value Integral Inequality:

$$\Big| \int_{\mathbb{R}} f \, dx \Big| \le \int_{\mathbb{R}} |f| \, dx. \qquad (11.1)$$

For the proof, use the canonical representation of $f$. If $f$ vanishes outside a
compact interval $A$, then

$$\int_{\mathbb{R}} |f| \, dx \le \ell(A) \sup_{x \in A} |f(x)|. \qquad (11.2)$$

The natural way to measure the distance between the elements of $St(\mathbb{R})$ is provided
by a substitute of the notion of norm, called the $L^1$-seminorm (or the seminorm
of index $L^1$). Precisely, it is the function

$$\| \cdot \|_{L^1} : St(\mathbb{R}) \to \mathbb{R}, \quad \|f\|_{L^1} = \int_{\mathbb{R}} |f| \, dx,$$

whose main properties are very close to those of a norm:

(SN1) $0 \le \|f\|_{L^1} < \infty$;
(SN2) $\|\alpha f\|_{L^1} = |\alpha| \, \|f\|_{L^1}$;
(SN3) $\|f + g\|_{L^1} \le \|f\|_{L^1} + \|g\|_{L^1}$.

Nevertheless, it is possible that $\|f\|_{L^1} = 0$ though $f$ is not identically 0. See for
example the case where $f = \chi_{\{0\}}$.

Following the model of Euclidean spaces, we can introduce in $St(\mathbb{R})$ the notions
of sequence converging to a certain function and of $L^1$-Cauchy sequence.

A sequence $(f_n)_n$ of step functions is said to be convergent in the $L^1$-seminorm
to a step function $f$ if $\|f_n - f\|_{L^1} \to 0$.

The sequence $(f_n)_n$ is said to be an $L^1$-Cauchy sequence if for every $\varepsilon > 0$, there
is a natural number $N$ such that $\|f_m - f_n\|_{L^1} < \varepsilon$ whenever $m, n \ge N$.

The notion of limit of a convergent sequence is not well defined because it is
possible that $\|f_n - f\|_{L^1} \to 0$ and $\|f_n - g\|_{L^1} \to 0$, though $f \ne g$. We will come
back to this problem in the next section.


11.2.4 Remark The formula

$$\lambda(A) = \int_{\mathbb{R}} \chi_A \, dx,$$

which relates the Lebesgue measure of elementary sets to the Lebesgue integral of
step functions, establishes a certain equivalence of these two mathematical objects
and will be used in Sect. 11.5 to extend the Lebesgue measure to a very large class
of subsets of $\mathbb{R}$ that can be “measured”.


Exercises 


1. Prove that the set $St(\mathbb{R})$ of all step functions is a commutative algebra with
respect to the pointwise operations.

[Hint: Take into account that $\chi_A \chi_B = \chi_{A \cap B}$.]

2. Infer from the inequalities (11.1) and (11.2) that the Lebesgue integral of step functions
has a unique extension to the linear space $\overline{St(\mathbb{R})}$, of all uniform limits of
sequences of step functions, endowed with the sup norm (the closure of $St(\mathbb{R})$
in $\ell^{\infty}(\mathbb{R})$). Clearly, $\overline{St(\mathbb{R})}$ contains all regular functions in the sense of Definition 9.1.5.
Is the converse true?

3. (Integral with respect to a finite additive measure). Let $\mathcal{C}$ be a ring of subsets of
an abstract set $T$. A function $f : T \to \mathbb{C}$ is said to be a $\mathcal{C}$-step function if it is
a linear combination of characteristic functions of sets belonging to $\mathcal{C}$. The set
$St(\mathcal{C})$ of all $\mathcal{C}$-step functions is a linear space that coincides with the space $St(\mathbb{R})$
when $\mathcal{C}$ is the ring of elementary subsets of $\mathbb{R}$.


(a) Following the model of Lebesgue integral on $\mathcal{E}(\mathbb{R})$, prove that every finite
additive measure $\mu : \mathcal{C} \to [0, \infty)$ gives rise to an integration functional
$I : St(\mathcal{C}) \to \mathbb{C}$ which is linear and positive.

(b) We denote $I(f)$ by $\int_T f \, d\mu$. Prove the inequalities (11.1) and (11.2) in the
context of integration with respect to $\mu$.

(c) Extend the integration functional to the space $\overline{St(\mathcal{C})}$, of all uniform limits of
sequences of $\mathcal{C}$-step functions, endowed with the sup norm.


4. (Finite sums are integrals). Let $w = (w_1, w_2, \ldots, w_n)$ be an $n$-tuple of strictly
positive numbers (called weights) and consider the weighted counting measure

$$\mu_w : \mathcal{P}(\{1, 2, \ldots, n\}) \to [0, \infty), \quad \mu_w(A) = \sum_{k \in A} w_k.$$

In this case, the elements of $St(\mathcal{P}(\{1, 2, \ldots, n\}), \mathbb{R})$ are the arbitrary strings
$a = (a_1, a_2, \ldots, a_n)$ of $n$ real numbers and the integral of such a string with
respect to $\mu_w$ is given by the formula

$$\int a \, d\mu_w = \sum_{k=1}^{n} w_k a_k.$$

Formulate the analogue of the Cauchy-Buniakovski-Schwarz Inequality in the case of
the measure $\mu_w$. What about the analogue of Chebyshev’s Algebraic Inequality?
Compare to Exercises 6 and 7, Sect. 11.2.


372 11 The Theory of Lebesgue Integral 


11.3 Lebesgue Integrable Functions 


In Sect. 9.6, we introduced the notion of Lebesgue null set. Since it plays a major
role in the theory of Lebesgue integral, we recall it here for the convenience of the
reader.

A subset $X$ of $\mathbb{R}$ is called a Lebesgue null set (or a set of Lebesgue measure 0) if
for every $\varepsilon > 0$, there is a countable family of bounded intervals $I_k$ such that

$$X \subset \bigcup_k I_k \quad \text{and} \quad \sum_k \ell(I_k) < \varepsilon.$$


Every countable subset of R is a Lebesgue null set. The same is true for Cantor’s 
triadic set. 

A countable union of Lebesgue null sets is still a Lebesgue null set. See 
Exercise 1. Clearly, every subset of a Lebesgue null set is itself a Lebesgue null 
set. 
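The standard argument for countable sets covers the $k$-th point by an interval whose length decays geometrically, so that the total length stays below $\varepsilon$ however many points are covered. The sketch below illustrates this on a finite initial segment of an enumeration of the rationals in $[0, 1]$; the particular lengths $\varepsilon / 2^{k+2}$ and the helper names are assumptions made for the illustration.

```python
from fractions import Fraction

def cover(points, eps):
    """Cover the k-th point by an open interval of length eps / 2**(k + 2)."""
    intervals = []
    for k, x in enumerate(points):
        half = Fraction(eps) / 2 ** (k + 3)  # half of the interval length
        intervals.append((x - half, x + half))
    return intervals

def total_length(intervals):
    return sum(b - a for a, b in intervals)

# The first few rationals in [0, 1], enumerated by increasing denominator.
rationals = []
for q in range(1, 8):
    for p in range(q + 1):
        r = Fraction(p, q)
        if r not in rationals:
            rationals.append(r)

eps = Fraction(1, 100)
boxes = cover(rationals, eps)
print(len(rationals), float(total_length(boxes)))
```

Since the lengths sum to at most $\varepsilon \sum_{k \ge 0} 2^{-(k+2)} = \varepsilon / 2$, the bound does not depend on how many points of the enumeration are used.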

A property P(x) depending on a real variable x is called true almost everywhere 
(a.e. for short) if it is true for all possible values of the variable except for a Lebesgue 
null set. 

Thus, we can talk about: almost everywhere defined functions, almost everywhere 
continuous functions, sequences of functions converging almost everywhere, almost 
everywhere limit of a sequence and so on. These notions will appear frequently in 
the presentation of Lebesgue integral. 


11.3.1 Definition A function $f : \mathbb{R} \to \mathbb{C}$ is said to be Lebesgue integrable if there
exists an $L^1$-Cauchy sequence $(f_n)_n$ of step functions converging pointwise almost
everywhere to $f$.

The sequences $(f_n)_n$ will be referred to as approximating sequences.


Clearly, all step functions are Lebesgue integrable and the same is true for the
functions vanishing outside a Lebesgue null set.

The set $\mathcal{L}^1(\mathbb{R})$ of all Lebesgue integrable functions $f : \mathbb{R} \to \mathbb{C}$ is a linear space.
We define the Lebesgue integral of a function $f \in \mathcal{L}^1(\mathbb{R})$ via the formula

$$\int_{\mathbb{R}} f \, dx = \lim_{n \to \infty} \int_{\mathbb{R}} f_n \, dx, \qquad (11.3)$$

where $(f_n)_n$ is an approximating sequence for $f$. The limit in (11.3) exists. Indeed, according to the boundedness
property of Lebesgue integral of step functions (see inequality (11.1)), we
have

$$\Big| \int_{\mathbb{R}} f_m \, dx - \int_{\mathbb{R}} f_n \, dx \Big| \le \|f_m - f_n\|_{L^1},$$

which implies that the sequence $(I(f_n))_n$ is Cauchy in $\mathbb{C}$. The independence of its
limit from the particular sequence used to approximate $f$ is based on two technical
results.


11.3.2 Lemma Let $(f_n)_n$ be an $L^1$-Cauchy sequence of step functions. Then it contains
a subsequence which converges pointwise almost everywhere, and satisfies the
following additional property: given $\varepsilon > 0$, there exists a countable family $(A_n)_n$
of bounded intervals such that $\sum_n \ell(A_n) < \varepsilon$ and outside its union $Z_\varepsilon$, this subsequence
converges uniformly.


11.3.3 Lemma If $(f_n)_n$ and $(g_n)_n$ are two $L^1$-Cauchy sequences of step functions
which converge almost everywhere to the same function $f$, then

$$\lim_{n \to \infty} \int_{\mathbb{R}} |f_n - g_n| \, dx = 0.$$

As a consequence,

$$\lim_{n \to \infty} \int_{\mathbb{R}} f_n \, dx = \lim_{n \to \infty} \int_{\mathbb{R}} g_n \, dx.$$

Lemma 11.3.3, which settles the correctness of the definition of Lebesgue integral
via formula (11.3), follows from Lemma 11.3.2. The proofs of these lemmas are
the object of Sect. 11.4.

The integral of a function $f \in \mathcal{L}^1(\mathbb{R})$ will also be denoted

$$\int_{\mathbb{R}} f(x) \, dx \quad \text{or} \quad \int_{-\infty}^{\infty} f(x) \, dx.$$


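11.3.4 Lemma Every function which differs from a Lebesgue integrable function
on a Lebesgue null set is also integrable and has the same integral.

In particular, the integral of any function $f : \mathbb{R} \to \mathbb{C}$ which vanishes outside a
Lebesgue null set is 0.


The integrability of complex-valued functions is equivalent to the integrability of
their real and imaginary parts.


11.3.5 Lemma If $f \in \mathcal{L}^1(\mathbb{R})$, then $\mathrm{Re}\, f$, $\mathrm{Im}\, f$, $\overline{f} \in \mathcal{L}^1(\mathbb{R})$. Moreover,

$$\mathrm{Re} \int_{\mathbb{R}} f \, dx = \int_{\mathbb{R}} \mathrm{Re}\, f \, dx, \quad \mathrm{Im} \int_{\mathbb{R}} f \, dx = \int_{\mathbb{R}} \mathrm{Im}\, f \, dx \quad \text{and} \quad \overline{\int_{\mathbb{R}} f \, dx} = \int_{\mathbb{R}} \overline{f} \, dx.$$

Proof If $(f_n)_n$ is an approximating sequence for $f$, then $(\mathrm{Re}\, f_n)_n$ is an approximating
sequence for $\mathrm{Re}\, f$, the Cauchy condition being a consequence of the fact that
$|\mathrm{Re}\, f_m - \mathrm{Re}\, f_n| \le |f_m - f_n|$, which yields

$$\|\mathrm{Re}\, f_m - \mathrm{Re}\, f_n\|_{L^1} \le \|f_m - f_n\|_{L^1}.$$

Therefore $\mathrm{Re}\, f \in \mathcal{L}^1(\mathbb{R})$. Taking into account that the convergence of a sequence
of complex numbers means the convergence of its real and imaginary parts, we infer
that

$$\mathrm{Re} \int_{\mathbb{R}} f \, dx = \int_{\mathbb{R}} \mathrm{Re}\, f \, dx.$$

Now the proof can be completed easily.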


The Lebesgue integral gives rise to a functional

$$I : \mathcal{L}^1(\mathbb{R}) \to \mathbb{C}, \quad I(f) = \int_{\mathbb{R}} f \, dx,$$

which is linear and verifies the following condition of calibration:

$$I(\chi_A) = \lambda(A) \quad \text{for every } A \in \mathcal{E}(\mathbb{R}).$$

See Exercise 2. It is also positive:

$$f \in \mathcal{L}^1(\mathbb{R}) \text{ and } f \ge 0 \text{ at all points imply } \int_{\mathbb{R}} f \, dx \ge 0.$$

Indeed, if $(f_n)_n$ is an approximating sequence for $f$, then $(\mathrm{Re}\, f_n)_n$ is also an approximating
sequence for $f$ because $f$ is real-valued. Taking into account that $f \ge 0$,
we infer that actually $((\mathrm{Re}\, f_n)^+)_n$ is an approximating sequence for $f$. The limit
of a sequence of nonnegative numbers is necessarily a nonnegative number and this
ends the proof.

Lemma 11.3.4 allows us to state the property of positivity in the following form:

$$f \in \mathcal{L}^1(\mathbb{R}) \text{ and } f \ge 0 \text{ almost everywhere imply } \int_{\mathbb{R}} f \, dx \ge 0.$$

A consequence of the properties of linearity and positivity is the monotonicity:

$$f, g \in \mathcal{L}^1(\mathbb{R}) \text{ and } f \le g \text{ almost everywhere imply } \int_{\mathbb{R}} f \, dx \le \int_{\mathbb{R}} g \, dx.$$

11.3.6 Proposition (The invariance of Lebesgue integral under translation) If $f \in \mathcal{L}^1(\mathbb{R})$ and

$$T : \mathbb{R} \to \mathbb{R}, \quad T(x) = x + a,$$

is a translation on $\mathbb{R}$, then the function $f \circ T$ also belongs to $\mathcal{L}^1(\mathbb{R})$ and

$$\int_{\mathbb{R}} f(x + a) \, dx = \int_{\mathbb{R}} f(x) \, dx.$$

Proof If $A$ is a bounded interval of endpoints $u \le v$, then

$$A + a = \{x + a : x \in A\}$$

is also a bounded interval, of the same measure. This remark together with the
property of linearity of integral shows that the statement is true for the step functions.
Now the general case can be easily established by considering approximating
sequences.


The change of integral under horizontal dilation is shown by the following formula,

$$\int_{\mathbb{R}} f(\lambda x) \, dx = \frac{1}{|\lambda|} \int_{\mathbb{R}} f(x) \, dx, \qquad (11.4)$$

valid for all $f \in \mathcal{L}^1(\mathbb{R})$ and $\lambda \in \mathbb{R} \setminus \{0\}$. Consider first the case of step functions
(straightforward computation) and then use the approximating sequences.
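The "straightforward computation" for step functions can be sketched as follows, under the assumption that a step function is stored as a list of triples `(c, a, b)` standing for $c \cdot \chi_{[a, b)}$; the helper names are hypothetical. Translation shifts every interval without changing its length, while the substitution $x \mapsto \lambda x$ rescales every length by $1/|\lambda|$.

```python
from fractions import Fraction

def integral(terms):
    """Integral of a sum of c * chi_[a, b): by linearity, the sum of c * (b - a)."""
    return sum(c * (b - a) for c, a, b in terms)

def translate(terms, shift):
    """Terms of x -> f(x + shift): chi_[a, b)(x + shift) = chi_[a - shift, b - shift)(x)."""
    return [(c, a - shift, b - shift) for c, a, b in terms]

def dilate(terms, lam):
    """Terms of x -> f(lam * x) for lam != 0.

    For lam < 0 the interval is reflected; the endpoint that is included
    changes, but this affects only a null set and not the length.
    """
    lam = Fraction(lam)
    out = []
    for c, a, b in terms:
        lo, hi = sorted((a / lam, b / lam))
        out.append((c, lo, hi))
    return out

f = [(Fraction(2), 0, 3), (Fraction(-1), 1, 4)]  # 2 * chi_[0, 3) - chi_[1, 4)
print(integral(f), integral(translate(f, 5)), integral(dilate(f, -3)))
```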


11.3.7 Lemma (Absolute Value Integral Inequality) If $f \in \mathcal{L}^1(\mathbb{R})$, then $|f| \in \mathcal{L}^1(\mathbb{R})$ and

$$\Big| \int_{\mathbb{R}} f \, dx \Big| \le \int_{\mathbb{R}} |f| \, dx.$$

Proof For the first assertion, notice that if $(f_n)_n$ is an approximating sequence for $f$,
then $(|f_n|)_n$ is an approximating sequence for $|f|$. Indeed, $f_n \to f$ almost everywhere
implies $|f_n| \to |f|$ almost everywhere and the property of the sequence
$(|f_n|)_n$ of being $L^1$-Cauchy follows from the inequality

$$\big| |f_j| - |f_k| \big| \le |f_j - f_k|,$$

by integrating both sides.

The proof of the second assertion combines the definition of Lebesgue integral, the
continuity of the absolute value function and the Absolute Value Integral Inequality
for step functions:

$$\Big| \int_{\mathbb{R}} f \, dx \Big| = \Big| \lim_{n \to \infty} \int_{\mathbb{R}} f_n \, dx \Big| = \lim_{n \to \infty} \Big| \int_{\mathbb{R}} f_n \, dx \Big| \le \lim_{n \to \infty} \int_{\mathbb{R}} |f_n| \, dx = \int_{\mathbb{R}} |f| \, dx.$$

The convergence in the space $\mathcal{L}^1(\mathbb{R})$ is associated to the $L^1$-seminorm (or the seminorm
of index $L^1$) which is defined by the formula

$$\|f\|_{L^1} = \int_{\mathbb{R}} |f| \, dx.$$

The general properties of the $L^1$-seminorm are:

(SN1) $0 \le \|f\|_{L^1} < \infty$;
(SN2) $\|\alpha f\|_{L^1} = |\alpha| \, \|f\|_{L^1}$;
(SN3) $\|f + g\|_{L^1} \le \|f\|_{L^1} + \|g\|_{L^1}$.

As in the case of step functions, we can speak in $\mathcal{L}^1(\mathbb{R})$ about sequences converging
in the $L^1$-seminorm to a certain function and about $L^1$-Cauchy sequences, but
the notion of limit is not well defined because it is possible that $\|f_n - f\|_{L^1} \to 0$
and $\|f_n - g\|_{L^1} \to 0$, though $f \ne g$. However, $f$ and $g$ cannot differ from each other
too much because necessarily

$$\|f - g\|_{L^1} = 0,$$

and under these circumstances we will prove in Sect. 11.5 that $f = g$ except for a
Lebesgue null set.
A useful remark is the continuity of Lebesgue integral,

$$f_n \to f \text{ in } \mathcal{L}^1(\mathbb{R}) \text{ implies } \int_{\mathbb{R}} f_n \, dx \to \int_{\mathbb{R}} f \, dx.$$

In fact,

$$\Big| \int_{\mathbb{R}} f_n \, dx - \int_{\mathbb{R}} f \, dx \Big| \le \int_{\mathbb{R}} |f_n - f| \, dx = \|f_n - f\|_{L^1} \to 0.$$

The next result shows that the approximating sequences in the sense of
Definition 11.3.1 are also approximating sequences in the $L^1$-seminorm.
11.3.8 Lemma If $f \in \mathcal{L}^1(\mathbb{R})$ and $(e_n)_n$ is an approximating sequence for $f$, then

$$\lim_{n \to \infty} \int_{\mathbb{R}} |f - e_n| \, dx = 0.$$

Proof Let $\varepsilon > 0$ be arbitrarily fixed. Then there exists a natural number $N$ such that for
all indices $j, k \ge N$, we have

$$\int_{\mathbb{R}} |e_j - e_k| \, dx < \varepsilon.$$

For $k \ge N$ fixed, the sequence $(|e_j - e_k|)_j$ is an approximating sequence for
$|f - e_k|$. According to the definition of the integral,

$$\int_{\mathbb{R}} |f - e_k| \, dx = \lim_{j \to \infty} \int_{\mathbb{R}} |e_j - e_k| \, dx \le \varepsilon.$$


We are now in a position to prove an important feature of Lebesgue integral,
precisely, the fact that every $L^1$-Cauchy sequence in $\mathcal{L}^1(\mathbb{R})$ is convergent.


11.3.9 Theorem (on completeness of $\mathcal{L}^1(\mathbb{R})$) If $(f_n)_n$ is a Cauchy sequence in
$\mathcal{L}^1(\mathbb{R})$, then there is a function $f$ in $\mathcal{L}^1(\mathbb{R})$ such that $\|f_n - f\|_{L^1} \to 0$. Moreover,
a subsequence of $(f_n)_n$ is almost everywhere convergent to $f$.


Proof According to Lemma 11.3.8, for every natural number $n$, one can choose
$e_n \in St(\mathbb{R})$ such that $\|f_n - e_n\|_{L^1} < \frac{1}{2^n}$. The sequence $(e_n)_n$ is $L^1$-Cauchy. Indeed,

$$\|e_j - e_k\|_{L^1} \le \|e_j - f_j\|_{L^1} + \|f_j - f_k\|_{L^1} + \|f_k - e_k\|_{L^1} \le \frac{1}{2^j} + \|f_j - f_k\|_{L^1} + \frac{1}{2^k} \to 0$$

as $j, k \to \infty$.

By Lemma 11.3.2, the sequence $(e_n)_n$ has a subsequence $(e_{k(n)})_n$ that converges
pointwise to a function $f : \mathbb{R} \to \mathbb{C}$, except a Lebesgue null set $Z$. Since $(e_{k(n)})_n$ is
an approximating sequence for $f$, it follows from Definition 11.3.1 that $f \in \mathcal{L}^1(\mathbb{R})$.
According to Lemma 11.3.8, $\|f - e_{k(n)}\|_{L^1} \to 0$. Therefore

$$\|f_{k(n)} - f\|_{L^1} \le \|f_{k(n)} - e_{k(n)}\|_{L^1} + \|e_{k(n)} - f\|_{L^1} < \frac{1}{2^{k(n)}} + \|e_{k(n)} - f\|_{L^1} \to 0$$

as $n \to \infty$, which means that $f_{k(n)} \to f$ in $\mathcal{L}^1(\mathbb{R})$. Since

$$\|f_n - f\|_{L^1} \le \|f_n - f_{k(n)}\|_{L^1} + \|f_{k(n)} - f\|_{L^1},$$

we conclude that the sequence $(f_n)_n$ is itself convergent to $f$ in $\mathcal{L}^1(\mathbb{R})$.
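For the second part of the statement, we can assume (by replacing $(e_{k(n)})_n$ by a
subsequence if necessary) that in addition

$$|f_{k(n)} - e_{k(n)}| < \frac{1}{2^n}$$

on a set $X_n$, whose complementary is a countable union of bounded intervals
$A_{n1}, A_{n2}, \ldots$ with $\sum_{k=1}^{\infty} \ell(A_{nk}) < 1/2^n$. Then

$$f_{k(n)} - e_{k(n)} \to 0,$$

uniformly on each of the sets

$$Y_N = X_N \cap X_{N+1} \cap \cdots, \quad N \in \mathbb{N}.$$

Notice that $\complement Y_N$ is a countable union of bounded intervals whose sum of lengths does
not exceed

$$\sum_{n=N}^{\infty} \frac{1}{2^n} = \frac{1}{2^{N-1}}.$$

Therefore,

$$f - f_{k(n)} = (f - e_{k(n)}) + (e_{k(n)} - f_{k(n)}) \to 0$$

pointwise, except the Lebesgue null set

$$\complement \Big( \bigcup_{N} Y_N \Big) \cup Z = \Big( \bigcap_{N} \complement Y_N \Big) \cup Z.$$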


Theorem 11.3.9 has many important consequences. One of them is as follows: 


11.3.10 Beppo-Levi’s Theorem (Monotone Convergence Theorem for Integrals)
Let $(f_n)_n$ be an almost everywhere increasing sequence of functions in $\mathcal{L}^1(\mathbb{R})$ such
that

$$\sup_{n \in \mathbb{N}} \int_{\mathbb{R}} f_n \, dx < \infty.$$

Then $(f_n)_n$ is almost everywhere convergent to a function $f \in \mathcal{L}^1(\mathbb{R})$ and

$$\lim_{n \to \infty} \int_{\mathbb{R}} |f - f_n| \, dx = 0.$$

In particular,

$$\lim_{n \to \infty} \int_{\mathbb{R}} f_n \, dx = \int_{\mathbb{R}} f \, dx.$$

Proof By the Monotone Convergence Theorem for real sequences, the sequence
$\big( \int_{\mathbb{R}} f_n \, dx \big)_n$ is convergent. For $k \ge j$, we have

$$\|f_k - f_j\|_{L^1} = \int_{\mathbb{R}} |f_k - f_j| \, dx = \int_{\mathbb{R}} f_k \, dx - \int_{\mathbb{R}} f_j \, dx,$$

which implies that $(f_n)_n$ is an $L^1$-Cauchy sequence. According to Theorem 11.3.9, it
follows that $(f_n)_n$ is convergent in the $L^1$-seminorm to a function $f \in \mathcal{L}^1(\mathbb{R})$ and a
subsequence of it converges almost everywhere to $f$. As $(f_n)_n$ is almost everywhere
increasing, we conclude that the whole sequence is almost everywhere convergent
to $f$.
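A small computation shows how the hypothesis of bounded integrals looks in practice. The sketch considers the increasing sequence of step functions $f_N = \chi_{[-1,1]} + \sum_{n=2}^{N} 2^{-n} (\chi_{[-n,-n+1)} + \chi_{(n-1,n]})$, chosen here as an illustrative example, and computes the integrals exactly; the function name `partial_integral` is hypothetical.

```python
from fractions import Fraction

def partial_integral(N):
    """Integral of f_N = chi_[-1,1] + sum_{n=2}^N 2^-n (chi_[-n,-n+1) + chi_(n-1,n])."""
    total = Fraction(2)  # chi_[-1, 1] has integral 2
    for n in range(2, N + 1):
        # two intervals of length 1, each weighted by 2^-n
        total += 2 * Fraction(1, 2 ** n)
    return total

integrals = [partial_integral(N) for N in range(2, 12)]
print([str(v) for v in integrals])
```

The integrals increase and stay below 3, so the hypothesis of the theorem is satisfied; moreover, for $k \ge j$ the difference of the integrals equals $\|f_k - f_j\|_{L^1}$, which is the quantity used in the proof to obtain the Cauchy property.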


11.3.11 Corollary If $(f_n)_n$ is an almost everywhere decreasing sequence in $\mathcal{L}^1(\mathbb{R})$
which converges almost everywhere to 0, then

$$\lim_{n \to \infty} \int_{\mathbb{R}} |f_n| \, dx = 0.$$

Proof Apply Beppo-Levi’s Theorem to the sequence $(-f_n)_n$.


Another immediate consequence of Beppo-Levi’s Theorem is the following converse
to Lemma 11.3.4.


11.3.12 Corollary If $f \in \mathcal{L}^1(\mathbb{R})$ and $\|f\|_{L^1} = 0$, then $f = 0$ almost everywhere.


Proof Indeed, according to Beppo-Levi’s Theorem, the sequence of functions $g_n = n \cdot |f|$
is almost everywhere convergent to an integrable function $g$. Therefore $g_{n+1} - g_n \to g - g = 0$
almost everywhere. On the other hand, $|f| = g_{n+1} - g_n$ for all $n$,
whence $|f| = 0$ almost everywhere.


The next theorem provides a very suitable condition under which limits and inte- 
grals commute with each other. Applications will be discussed in Sect. 11.8, devoted 
to the dependence of integrals on parameters. 
11.3.13 Lebesgue’s Dominated Convergence Theorem Let $(f_n)_n$ be a sequence
of functions in $\mathcal{L}^1(\mathbb{R})$ which is almost everywhere convergent to a function
$f : \mathbb{R} \to \mathbb{C}$. If there exists a function $g \in \mathcal{L}^1(\mathbb{R})$ such that $|f_n| \le g$ almost everywhere
for all $n \in \mathbb{N}$, then $f \in \mathcal{L}^1(\mathbb{R})$ and

$$\lim_{n \to \infty} \int_{\mathbb{R}} |f_n - f| \, dx = 0.$$

In particular,

$$\int_{\mathbb{R}} f \, dx = \lim_{n \to \infty} \int_{\mathbb{R}} f_n \, dx.$$

Proof It suffices to prove that $(f_n)_n$ is an $L^1$-Cauchy sequence. For each pair of
natural numbers $j < k$, the function

$$\varphi_{j,k} = \sup_{j \le r, s \le k} |f_r - f_s|$$

is integrable and verifies $0 \le \varphi_{j,k} \le 2g$. Keeping $j$ fixed, the sequence $(\varphi_{j,k})_k$ is
increasing and from Beppo-Levi’s Theorem we infer that this sequence is convergent
in the $L^1$-seminorm to a function $\varphi_j \in \mathcal{L}^1(\mathbb{R})$ and almost everywhere $\varphi_j = \sup_{k > j} \varphi_{j,k}$.

The sequence $(\varphi_j)_j$ is almost everywhere decreasing and its limit is 0 at all points $x$
where the sequence $(f_n(x))_n$ is converging. According to Corollary 11.3.11, we
infer that $\|\varphi_j\|_{L^1} \to 0$.

Since $|f_i - f_j| \le \varphi_k$ for all $i, j \ge k$, we conclude that $\|f_i - f_j\|_{L^1} \to 0$ as
$i, j \to \infty$, that is, $(f_n)_n$ is an $L^1$-Cauchy sequence.


The conditions in Theorem 11.3.13 are “almost necessary” for the convergence
$\|f_n - f\|_{L^1} \to 0$. Indeed, if this convergence takes place, then a subsequence of
$(f_n)_n$ must verify these conditions. By Theorem 11.3.9, we can pass to a subsequence
$(f_{k(n)})_n$ to assure that $f_{k(n)} \to f$ almost everywhere. By replacing $(f_{k(n)})_n$ with a
subsequence if necessary, we may assume that $\|f_{k(n)} - f\|_{L^1} \le 2^{-n}$ for all $n$. Then
$g = |f| + \sum_{n=0}^{\infty} |f - f_{k(n)}|$ is an integrable function that verifies $|f_{k(n)}| \le g$ for all
$n$.
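The role of the dominating function can be seen on the classical sequence $f_n = n \chi_{(0, 1/n)}$, used here as an illustrative example that does not come from the text. It converges to 0 at every point, yet every $f_n$ has integral 1, so the conclusion of the theorem fails. The sketch below computes exactly the integral of the envelope $\max_{n \le N} f_n$, which any dominating function would have to exceed; the helper names are hypothetical.

```python
from fractions import Fraction

def f(n, x):
    """f_n = n * chi_(0, 1/n)."""
    return n if 0 < x < Fraction(1, n) else 0

def integral_on_unit_interval(func, N):
    """Exact integral over (0, 1) of a function that is constant on (0, 1/N)
    and on each interval (1/(k+1), 1/k) with 1 <= k < N."""
    points = [Fraction(0)] + [Fraction(1, k) for k in range(N, 0, -1)]
    total = Fraction(0)
    for a, b in zip(points, points[1:]):
        mid = (a + b) / 2  # any interior point gives the constant value
        total += func(mid) * (b - a)
    return total

N = 10
single = [integral_on_unit_interval(lambda x, n=n: f(n, x), N) for n in range(1, N + 1)]
envelope = integral_on_unit_interval(lambda x: max(f(n, x) for n in range(1, N + 1)), N)
harmonic = sum(Fraction(1, k) for k in range(1, N + 1))
print(single[0], envelope, harmonic)
```

The integral of the envelope equals the harmonic number $1 + 1/2 + \cdots + 1/N$, which is unbounded as $N$ grows, so no integrable function can dominate the whole sequence.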


Both Beppo-Levi’s Theorem and Lebesgue’s Dominated Convergence Theorem
can be used as integrability criteria. For example, in the first case, the statement
must be rephrased as follows: If $f$ is the pointwise limit of an almost everywhere
increasing sequence of functions $f_n \in \mathcal{L}^1(\mathbb{R})$ such that

$$\sup_{n \in \mathbb{N}} \int_{\mathbb{R}} f_n \, dx < \infty,$$

then $f \in \mathcal{L}^1(\mathbb{R})$ and

$$\lim_{n \to \infty} \int_{\mathbb{R}} |f - f_n| \, dx = 0.$$

In particular,

$$\int_{\mathbb{R}} f \, dx = \lim_{n \to \infty} \int_{\mathbb{R}} f_n \, dx.$$

See Exercises 3, 4 and 5 for applications.
11.3.14. The Space $L^1(\mathbb{R})$ The seminormed linear space $\mathcal{L}^1(\mathbb{R})$ can be turned
into a Banach space by identifying the functions which differ only on Lebesgue null
sets. For this, let us consider on $\mathcal{L}^1(\mathbb{R})$ the following relation of equivalence:

$$f \sim g \text{ if and only if } f = g \text{ almost everywhere.}$$

According to Lemma 11.3.4 and Corollary 11.3.12, we know that

$$\|f\|_{L^1} = 0 \text{ if and only if } f = 0 \text{ almost everywhere,}$$

and thus

$$f \sim g \text{ if and only if } \|f - g\|_{L^1} = 0.$$

One can prove easily that the quotient set

$$L^1(\mathbb{R}) = \mathcal{L}^1(\mathbb{R}) / \sim$$

is a normed linear space with respect to the algebraic operations and the quotient
norm given by:

$$\hat{f} + \hat{g} = \widehat{f + g}, \qquad \alpha \cdot \hat{f} = \widehat{\alpha f}, \qquad \|\hat{f}\|_{L^1} = \|f\|_{L^1}.$$

By Theorem 11.3.9, we infer that $L^1(\mathbb{R})$ is complete with respect to $\| \cdot \|_{L^1}$. We call
$L^1(\mathbb{R})$ the Banach space of all classes of Lebesgue integrable functions on $\mathbb{R}$.

An equivalence class may contain at most one continuous function; in fact, if
$f, g : \mathbb{R} \to \mathbb{C}$ are continuous functions and $f = g$ a.e., then $f = g$. Consequently,
there is an injective function

$$F : C(\mathbb{R}) \cap \mathcal{L}^1(\mathbb{R}) \to L^1(\mathbb{R}), \quad F(f) = \hat{f},$$

that allows us to identify $C(\mathbb{R}) \cap \mathcal{L}^1(\mathbb{R})$ with a linear subspace of $L^1(\mathbb{R})$.


Exercises 


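1. Prove that a countable union of Lebesgue null sets is still a Lebesgue null set.

2. Prove that the set $\mathcal{L}^1(\mathbb{R})$ of Lebesgue integrable functions $f : \mathbb{R} \to \mathbb{C}$ is a linear
space with respect to the pointwise operations. Then, prove the linearity of the
Lebesgue integral,

$$\int_{\mathbb{R}} (\alpha f + \beta g) \, dx = \alpha \int_{\mathbb{R}} f \, dx + \beta \int_{\mathbb{R}} g \, dx$$

for all $f, g \in \mathcal{L}^1(\mathbb{R})$ and $\alpha, \beta \in \mathbb{C}$.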


3. Use Beppo-Levi’s Theorem to prove that the positive function

$$f = \chi_{[-1,1]} + \sum_{n=2}^{\infty} 2^{-n} \big( \chi_{[-n,-n+1)} + \chi_{(n-1,n]} \big)$$

is integrable on $\mathbb{R}$.

4. (A connection between series and integrals). Prove that a numerical series $\sum_{n=0}^{\infty} a_n$
is absolutely convergent if and only if the function

$$f = \sum_{n=0}^{\infty} a_n \cdot \chi_{[n,n+1)}$$

is Lebesgue integrable. Notice that the sum of the series equals the integral of $f$.

5. Prove that every continuous function $f : \mathbb{R} \to \mathbb{R}$ such that $\lim_{|x| \to \infty} x^2 f(x) = 0$ is
Lebesgue integrable.
[Hint: Use Lebesgue’s Dominated Convergence Theorem.]


6. (Fatou’s Lemma). Let $(f_n)_n$ be a sequence of nonnegative functions in $\mathcal{L}^1(\mathbb{R})$
such that

$$\liminf_{n \to \infty} \int_{\mathbb{R}} f_n(x) \, dx < \infty.$$

Prove that the function $\liminf_{n \to \infty} f_n(x)$ also belongs to $\mathcal{L}^1(\mathbb{R})$ and

$$\int_{\mathbb{R}} \liminf_{n \to \infty} f_n(x) \, dx \le \liminf_{n \to \infty} \int_{\mathbb{R}} f_n(x) \, dx.$$

[Hint: For $m$ arbitrarily fixed, the sequence of nonnegative integrable functions
$g_n = \inf_{m \le j \le m+n} f_j$ is pointwise decreasing to $h_m = \inf_{j \ge m} f_j$, so by applying
Beppo-Levi’s Theorem to the sequence $(-g_n)_n$ we infer that the functions $h_m$ are
integrable. Since $\int_{\mathbb{R}} h_m \, dx \le \int_{\mathbb{R}} f_j \, dx$ for all $j \ge m$, we have

$$\int_{\mathbb{R}} h_m \, dx \le \inf_{j \ge m} \int_{\mathbb{R}} f_j \, dx \le \liminf_{j \to \infty} \int_{\mathbb{R}} f_j(x) \, dx.$$

Then take into account that the sequence $(h_m)_m$ is increasing and its pointwise
limit is $\liminf f_j$.]


11.4 Proof of Technical Lemmas 11.3.2 and 11.3.3 


Proof of Lemma 11.3.2. Since $(f_n)_n$ is an $L^1$-Cauchy sequence, for every natural
number $n \in \mathbb{N}$, there exists a rank $k_n$ such that for all $i, j \ge k_n$, we have

$$\|f_i - f_j\|_{L^1} < 1/2^{2n}.$$

Choosing $(k_n)_n$ strictly increasing, the subsequence $g_n = f_{k_n}$ verifies the relation

$$\|g_{n+1} - g_n\|_{L^1} < 1/2^{2n} \qquad (11.5)$$

for every $n \in \mathbb{N}$. We will prove that this subsequence has all properties stated in
Lemma 11.3.2.
Indeed, the series

$$g_0(x) + \sum_{k=0}^{\infty} [g_{k+1}(x) - g_k(x)], \qquad (11.6)$$

whose partial sums are the functions $g_n(x)$, converges absolutely almost everywhere
and this convergence is uniform except on a set of arbitrarily small measure.
In order to prove this, let us consider the sets

$$Y_n = \Big\{ x : |g_n(x) - g_{n+1}(x)| \ge \frac{1}{2^n} \Big\}, \quad n \in \mathbb{N}.$$

Since $|g_n - g_{n+1}|$ is a step function, $Y_n$ is an elementary set. Integrating the inequality
$2^{-n} \cdot \chi_{Y_n} \le |g_n - g_{n+1}|$ and taking into account (11.5), we infer that $\lambda(Y_n) \le 2^{-n}$.
Put

$$Z_n = Y_n \cup Y_{n+1} \cup \cdots.$$

Then

$$\sum_{k \ge n} \lambda(Y_k) \le \frac{1}{2^{n-1}}.$$

If $x \notin Z_n$, then for every $k \ge n$, we have

$$|g_k(x) - g_{k+1}(x)| < \frac{1}{2^k}$$

and thus the series (11.6) is indeed absolutely and uniformly convergent on the
complement of $Z_n$. This series is pointwise convergent on

$$\complement \Big( \bigcap_{n=0}^{\infty} Z_n \Big)$$

and it is clear that $\bigcap_{n=0}^{\infty} Z_n$ is a Lebesgue null set. The proof ends by noticing that for
given $\varepsilon > 0$, one can choose as $Z_\varepsilon$ every set $Z_n$ with

$$\frac{1}{2^{n-1}} < \varepsilon.$$


Proof of Lemma 11.3.3. Put

$$h_n = |f_n - g_n|, \quad n \in \mathbb{N}.$$

Then $(h_n)_n$ is an $L^1$-Cauchy sequence of step functions that converges almost
everywhere to 0. For the first property, notice that

$$\big| |f_j - g_j| - |f_k - g_k| \big| \le |f_j - f_k| + |g_j - g_k|,$$

which yields

$$\|h_j - h_k\|_{L^1} \le \|f_j - f_k\|_{L^1} + \|g_j - g_k\|_{L^1}.$$

We have to prove that $(h_n)_n$ converges in the $L^1$-seminorm to 0 and for this it
suffices to show that a subsequence of it has this convergence property.

Thus, by replacing $(h_n)_n$ with a suitable subsequence, we may assume that $(h_n)_n$
verifies the uniform convergence property mentioned in Lemma 11.3.2.

Let $\varepsilon > 0$ be arbitrarily fixed. Since $(h_n)_n$ is an $L^1$-Cauchy sequence, there exists
a natural number $N$ such that $\|h_j - h_k\|_{L^1} < \varepsilon/5$, whenever $j, k \ge N$. The set
$A = \{x : h_N(x) \ne 0\}$ belongs to $\mathcal{E}(\mathbb{R})$ and has finite measure. Clearly, for every
$n \ge N$,

$$\int_{\mathbb{R}} h_n \chi_{\complement A} \, dx = \int_{\mathbb{R}} |h_n - h_N| \chi_{\complement A} \, dx \le \int_{\mathbb{R}} |h_n - h_N| \, dx < \frac{\varepsilon}{5}.$$
Put $M = \max \{h_N(x) : x \in \mathbb{R}\}$. According to the discussion above, there exist a
sequence $A_0, A_1, A_2, \ldots$ of bounded open intervals with

$$\sum_{k=0}^{\infty} \ell(A_k) < \frac{\varepsilon}{5(1 + M)}$$

and a natural number $N' > N$ such that for every $n \ge N'$ and every $x$ in the
complementary set of $\bigcup_{k=0}^{\infty} A_k$ we have

$$0 \le h_n(x) < \frac{\varepsilon}{5(1 + \lambda(A))}.$$

Put

$$B_n = \Big\{ x : h_n(x) \ge \frac{\varepsilon}{5(1 + \lambda(A))} \Big\}.$$

The sets $B_n$ belong to $\mathcal{E}(\mathbb{R})$ and for $n \ge N'$ we have

$$B_n \subset \bigcup_{k=0}^{\infty} A_k.$$

For $n \ge N'$, every compact set $C \in \mathcal{E}(\mathbb{R})$ included in $B_n$ is covered by the open
sets $A_0, A_1, A_2, \ldots$ and thus, there must exist a natural number $p = p(n, C)$ such
that $C \subset \bigcup_{k=0}^{p} A_k$. Then,

$$\lambda(C) \le \sum_{k=0}^{p} \ell(A_k) \le \sum_{k=0}^{\infty} \ell(A_k) < \frac{\varepsilon}{5(1 + M)},$$

and the property of regularity assures us that

$$\lambda(B_n) \le \frac{\varepsilon}{5(1 + M)}.$$


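Every function $h_n$ admits the decomposition

$$h_n = h_n \cdot \chi_{B_{N'}} + h_n \cdot \chi_{A \setminus B_{N'}} + h_n \cdot \chi_{\complement A \cap \complement B_{N'}}$$

$$= (h_n - h_N) \cdot \chi_{B_{N'}} + h_N \cdot \chi_{B_{N'}} + (h_n - h_{N'}) \cdot \chi_{A \setminus B_{N'}} + h_{N'} \cdot \chi_{A \setminus B_{N'}} + h_n \cdot \chi_{\complement A \cap \complement B_{N'}},$$

and for every $n \ge N'$ we have

$$\int_{\mathbb{R}} |h_n - h_N| \cdot \chi_{B_{N'}} \, dx \le \|h_n - h_N\|_{L^1} < \frac{\varepsilon}{5},$$

$$\int_{\mathbb{R}} h_N \cdot \chi_{B_{N'}} \, dx \le M \cdot \frac{\varepsilon}{5(1 + M)} < \frac{\varepsilon}{5},$$

$$\int_{\mathbb{R}} |h_n - h_{N'}| \cdot \chi_{A \setminus B_{N'}} \, dx \le \|h_n - h_{N'}\|_{L^1} < \frac{\varepsilon}{5},$$

$$\int_{\mathbb{R}} h_{N'} \cdot \chi_{A \setminus B_{N'}} \, dx \le \frac{\varepsilon}{5(1 + \lambda(A))} \cdot \lambda(A) < \frac{\varepsilon}{5},$$

$$\int_{\mathbb{R}} h_n \cdot \chi_{\complement A \cap \complement B_{N'}} \, dx \le \int_{\mathbb{R}} h_n \cdot \chi_{\complement A} \, dx < \frac{\varepsilon}{5}.$$

Therefore $\|h_n\|_{L^1} < \varepsilon$ for every $n \ge N'$ and the proof is done.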


11.5 Integral of Measurable Functions 


A very basic notion in the theory of Lebesgue integral is that of measurability. 
It allows us to relate in a convenient way the class of Lebesgue integrable functions 
to other classes of usual functions already presented in the previous chapters. 


11.5.1 Definition A function $f : \mathbb{R} \to \mathbb{C}$ is called measurable if it is the almost
everywhere limit of a sequence of step functions.


A subset of R is measurable if its characteristic function is a measurable function. 

Clearly, every integrable function is also measurable, but the converse is not true. 
See the case of the identity of R. 

One can prove easily that the set $\mathcal{M}(\mathbb{R})$, of all complex-valued measurable
functions defined on $\mathbb{R}$, is an algebra with respect to the pointwise operations. See
Exercise 1.

The set $\mathcal{M}(\mathbb{R}, \mathbb{R})$, of all real-valued measurable functions defined on $\mathbb{R}$, is a linear
lattice of functions. For this, it suffices to show that $|f| \in \mathcal{M}(\mathbb{R}, \mathbb{R})$, whenever
$f \in \mathcal{M}(\mathbb{R}, \mathbb{R})$. See Remark 1.2.9. The proof is based on two facts: (a) the absolute
value of a step function is also a step function; (b) according to the property of
continuity of the absolute value function,

$$f_n(x) \to f(x) \text{ almost everywhere implies } |f_n(x)| \to |f(x)| \text{ almost everywhere.}$$

Every function which is equal to a measurable function except for a Lebesgue null
set is itself measurable. Thus, one can speak of measurable functions $f : A \to \mathbb{C}$
defined almost everywhere on $\mathbb{R}$, and even of measurable functions $f : A \to \overline{\mathbb{R}}$
defined almost everywhere and finite almost everywhere.
The theory of complex-valued measurable functions can be reduced to that of
real-valued measurable functions due to the fact that a function $f$ is measurable if
and only if both $\mathrm{Re}\, f$ and $\mathrm{Im}\, f$ are measurable.


11.5.2 Comparison Theorem (Lebesgue's Criterion of Integrability) A function
$f : \mathbb{R} \to \mathbb{C}$ is Lebesgue integrable if and only if it is measurable and there is a
Lebesgue integrable function $g$ such that $|f| \le g$.


Proof The necessity is obvious. For the sufficiency, it is enough to consider only real
functions. Since $f$ is measurable, there is a sequence $(f_n)_n$ of step functions such
that $f_n \to f$ a.e. Let


\[ g_n = \min\{\max\{f_n, -g\}, g\}. \]


Since $\mathcal{L}^1(\mathbb{R})$ is a linear lattice, the functions $g_n$ are integrable and $|g_n| \le g$ for all $n$.
Clearly,


\[ g_n \to \min\{\max\{f, -g\}, g\} = f \]


almost everywhere, so by the Dominated Convergence Theorem we conclude that $f$
is integrable.
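
The truncation step of this proof can be tried on numbers. The following Python sketch works at a single point $x$: it clips the values $f_n(x)$ to the interval $[-g(x), g(x)]$, which is the operation $\min\{\max\{f_n, -g\}, g\}$. The helper names `truncate` and `truncated_sequence` are illustrative assumptions, not notation from the text.

```python
def truncate(value, bound):
    """Return min(max(value, -bound), bound); bound is assumed to be >= 0."""
    return min(max(value, -bound), bound)


def truncated_sequence(values, bound):
    """Apply the truncation to every term f_n(x) of a sequence of values."""
    return [truncate(v, bound) for v in values]


# Values f_n(x) = 0.5 + 3/n converge to f(x) = 0.5, and |f(x)| <= g(x) = 1.
f_values = [0.5 + 3.0 / n for n in range(1, 50)]
g_values = truncated_sequence(f_values, 1.0)
```

The early terms exceed the bound and are clipped to 1, while the later terms are left unchanged, so the clipped sequence is dominated by $g$ and still converges to $f(x)$.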


If $f : \mathbb{R} \to \mathbb{C}$ is a measurable function which differs from 0 almost everywhere,
then the function $1/f$ is also measurable.
A consequence of Lebesgue's Criterion of Integrability is as follows:


11.5.3. Measurable Limit Theorem If $(f_n)_n$ is a sequence in $\mathcal{M}(\mathbb{R})$ and $f_n \to f$
almost everywhere, then $f \in \mathcal{M}(\mathbb{R})$.


Proof Clearly, we may restrict ourselves to real-valued functions and choose a function
$F \in \mathcal{L}^1(\mathbb{R})$ such that $F > 0$. By Lebesgue's Criterion of Integrability, it follows
that all functions $g_n = f_n F/(1 + |f_n|)$ are integrable. Since


\[ g_n \to g = \frac{fF}{1 + |f|} \]


almost everywhere, and $|g_n| \le F$ for all $n$, we infer from the Dominated Convergence
Theorem that $g \in \mathcal{L}^1(\mathbb{R})$. Since $|g| < F$, we can express $f$ as a product of measurable
functions, $f = g \cdot \dfrac{1}{F - |g|}$.


The set $\mathfrak{M}(\mathbb{R})$, of all Lebesgue measurable sets, contains all Lebesgue null sets
and also all intervals. If $A$ is a bounded interval, then $\chi_A$ is a step function, while if
the interval $A$ is unbounded, then it is the union of an increasing sequence $(A_n)_n$ of
bounded intervals, whence $\chi_A = \lim_{n \to \infty} \chi_{A_n}$.


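The set $\mathfrak{M}(\mathbb{R})$ is an example of a $\sigma$-algebra on $\mathbb{R}$. This fact is immediate if we take
into account the formulas

\[ \chi_{\emptyset} = 0, \qquad \chi_{\mathbb{R}} = 1, \qquad \chi_{\complement A} = 1 - \chi_A, \]

\[ \chi_{\bigcup_{k=1}^{n} A_k} = \max\{\chi_{A_k} : 1 \le k \le n\}, \qquad \chi_{\bigcap_{k=1}^{n} A_k} = \min\{\chi_{A_k} : 1 \le k \le n\}, \]

\[ \chi_{\bigcup_{k=1}^{\infty} A_k} = \lim_{n \to \infty} \chi_{\bigcup_{k=1}^{n} A_k}, \qquad \chi_{\bigcap_{k=1}^{\infty} A_k} = \lim_{n \to \infty} \chi_{\bigcap_{k=1}^{n} A_k}, \]

and Theorem 11.5.3.

In what follows, we will refer to $\mathfrak{M}(\mathbb{R})$ as the $\sigma$-algebra of Lebesgue measurable
sets. It is the $\sigma$-algebra generated by the intervals and the Lebesgue null sets. See
Exercise 10.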

An example of a non-measurable set (based on Axiom of Choice) is presented in 
the section of Notes and Remarks at the end of this chapter. 

The following characterization of measurable functions suggests the existence 
of a certain similarity between the properties of measurability and continuity. This 
matter is clarified by a result due to Luzin, about the “near continuity” of measurable 
functions. See the Notes and Remarks at the end of this chapter. 


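11.5.4 Theorem A function $f : \mathbb{R} \to \mathbb{R}$ is measurable if and only if it verifies one
of the following equivalent conditions:


(a) $\{x : f(x) > a\} \in \mathfrak{M}(\mathbb{R})$ for every $a \in \mathbb{R}$;
(b) $\{x : f(x) \ge a\} \in \mathfrak{M}(\mathbb{R})$ for every $a \in \mathbb{R}$;
(c) $\{x : f(x) < a\} \in \mathfrak{M}(\mathbb{R})$ for every $a \in \mathbb{R}$;
(d) $\{x : f(x) \le a\} \in \mathfrak{M}(\mathbb{R})$ for every $a \in \mathbb{R}$.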


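Proof The equivalence of the conditions (a)-(d) is a consequence of the following
equalities:


\[ \{x : f(x) \ge a\} = \bigcap_{n=1}^{\infty} \left\{x : f(x) > a - \frac{1}{n}\right\}, \]

\[ \{x : f(x) < a\} = \complement \{x : f(x) \ge a\}, \]

\[ \{x : f(x) \le a\} = \bigcap_{n=1}^{\infty} \left\{x : f(x) < a + \frac{1}{n}\right\}, \]

\[ \{x : f(x) > a\} = \complement \{x : f(x) \le a\}. \]


If $f$ is measurable, then all the sets $\{x : f(x) > a\}$ are measurable due to the fact
that $\chi_{\{x : f(x) > a\}}$ is the pointwise limit of a sequence of measurable functions:


\[ \chi_{\{x : f(x) > a\}} = \lim_{n \to \infty} n \left[ \min\left\{f, a + \frac{1}{n}\right\} - \min\{f, a\} \right]. \]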


Conversely, if $f$ verifies the conditions (b) and (d), then both functions $f^-$ and
$f^+$ verify the condition (d) because

\[ \{x : f^-(x) \le a\} = \{x : f(x) \ge -a\}, \qquad \{x : f^+(x) \le a\} = \{x : f(x) \le a\}, \]

for every $a \ge 0$. This remark allows us to reduce ourselves to the case where $f$ is
positive. In this case, consider the sequence of measurable functions


\[ f_n(x) = \frac{1}{2^n} \sum_{k=1}^{\infty} \chi_{\{f \ge k/2^n\}}(x) = \frac{[2^n f(x)]}{2^n}, \qquad n \in \mathbb{N}. \]


The measurability of $f_n$ is a consequence of Theorem 11.5.3. Since

\[ 2^n f(x) - 1 < [2^n f(x)] \le 2^n f(x) \]

for all $n$ and $x$, it follows that $|f_n - f| < \frac{1}{2^n}$ for all $n$ and thus, the measurability of
$f$ is assured by Theorem 11.5.3.
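
The dyadic approximation used at the end of this proof is easy to test numerically. The sketch below evaluates $[2^n y]/2^n$ for a single positive value $y = f(x)$; the function name `dyadic_approx` is a hypothetical helper, not part of the text.

```python
import math


def dyadic_approx(y, n):
    """Return floor(2**n * y) / 2**n, the n-th dyadic approximation of y >= 0."""
    scale = 2 ** n
    return math.floor(scale * y) / scale


# Approximations of y = pi for n = 0, 1, ..., 19.
approximations = [dyadic_approx(math.pi, n) for n in range(20)]
```

The list is increasing and its n-th term lies within $2^{-n}$ of the target value, in agreement with the estimate $|f_n - f| < 2^{-n}$.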


According to Theorem 11.5.4, every continuous function is also measurable. The
opposite implication is not true because there exist measurable functions (such as
$\chi_{\mathbb{Q}}$) which have no point of continuity.


11.5.5 Remark The proof of Theorem 11.5.4 outlines the interesting fact that every 
positive measurable function is the pointwise limit of an increasing sequence of 
positive step functions. 


The whole theory of integration can be adapted mutatis mutandis to the case of
functions defined on measurable sets. Thus, if $f : A \to \mathbb{C}$ is such a function, we
say that $f$ is measurable (respectively integrable) if $f\chi_A$, the extension of $f$ to $\mathbb{R}$
which vanishes outside $A$, has this property. The integral of $f$ is by definition


\[ \int_A f\,dx = \int_{\mathbb{R}} f\chi_A\,dx. \]


According to Lebesgue's Criterion of Integrability, if $f \in \mathcal{L}^1(\mathbb{R})$ and $A$ is a
measurable set, then $f\chi_A$ is integrable.

The set $\mathcal{L}^1(A)$, of all Lebesgue integrable functions $f : A \to \mathbb{C}$, is a vector
space with respect to pointwise operations, that can be identified with a subspace of
$\mathcal{L}^1(\mathbb{R})$ via the injective map $f \mapsto f \cdot \chi_A$.

$\mathcal{L}^1(A)$ displays properties similar to those of $\mathcal{L}^1(\mathbb{R})$. In particular, $\mathcal{L}^1(A)$ is complete with
respect to the $L^1$-seminorm


\[ \|f\|_{L^1} = \int_A |f|\,dx, \]

and all important results like the Absolute Value Integral Inequality, Beppo-Levi
Theorem, Lebesgue's Dominated Convergence Theorem and Lebesgue's Criterion
of Integrability remain true for functions in $\mathcal{L}^1(A)$.

The construction of the space $L^1(A)$ mimics that of the space $L^1(\mathbb{R})$ and we omit
the details.

The following result is a consequence of the Comparison Theorem 11.5.2,
Lebesgue's Dominated Convergence Theorem and Lemma 11.3.4.


11.5.6 Theorem (The $\sigma$-additivity of the integral) Suppose that $f \in \mathcal{L}^1(A)$ and
$A = \bigcup_{n=0}^{\infty} A_n$ is a decomposition of $A$ into a countable family of pairwise disjoint
measurable sets (or, more generally, such that all intersections $A_m \cap A_n$ are Lebesgue
null sets for $m \ne n$). Then

\[ \int_A f\,dx = \sum_{n=0}^{\infty} \int_{A_n} f\,dx. \]


The following result outlines the most important class of measurable functions. 


11.5.7 Theorem If $A$ is a measurable subset of $\mathbb{R}$ and $f : A \to \mathbb{C}$ is an almost
everywhere continuous function, then $f$ is also a measurable function.


Proof Without loss of generality, we may assume that $f$ is real-valued.
By our hypothesis, the characteristic function $\chi_A$ is the almost everywhere limit
of a sequence $(\varphi_n)_n$ of step functions. By replacing each function $\varphi_n$ by the function


\[ x \mapsto \begin{cases} 0 & \text{if } \varphi_n(x) \le 1/2 \\ 1 & \text{if } \varphi_n(x) > 1/2, \end{cases} \]


we infer that $\chi_A$ is the almost everywhere limit of a sequence of characteristic functions
$\chi_{A_n}$, where each set $A_n$ is a finite union $\bigcup_{j=0}^{m(n)} A_{nj}$ of mutually disjoint intervals.

Dividing these intervals if necessary, we may assume that $\ell(A_{nj}) < 2^{-n}$ for all
$j \in \{0, \dots, m(n)\}$ and $n \in \mathbb{N}$.

The points of $A$ where $f \cdot \chi_A$ is not continuous, together with the points where the
sequence $(\chi_{A_n})_n$ is not convergent to $\chi_A$, constitute a Lebesgue null set $X$. Moreover, every point in
$A \setminus X$ belongs to a unique interval $A_{nj}$ for $n$ sufficiently large, while every point in
$\complement A \setminus X$ does not belong to any interval $A_{nj}$ for $n$ sufficiently large.

Consider now the sequence of step functions $f_n = \sum_{j=0}^{m(n)} c_{nj} \chi_{A_{nj}}$, where


\[ c_{nj} = \begin{cases} 0 & \text{if } A_{nj} \cap A = \emptyset \\ f(x_{nj}) & \text{if } A_{nj} \cap A \ne \emptyset \end{cases} \]


and $x_{nj}$ is arbitrarily chosen in $A_{nj} \cap A$.

On $\complement A \setminus X$, all functions $f_n$ vanish for $n$ sufficiently large, so that $f_n(x) \to (f \cdot \chi_A)(x)$ if $x \in \complement A \setminus X$.
If $x \in A \setminus X$, then $x$ is a point of continuity for $f$ and the discussion above shows
that $x_{nj} \to x$ as $n \to \infty$. Therefore, $f_n(x) \to (f \cdot \chi_A)(x)$ also for $x \in A \setminus X$. In
conclusion, $f \cdot \chi_A$ is the almost everywhere limit of the sequence of step functions
$f_n$ and thus, measurable, according to Theorem 11.5.3.


Every Riemann integrable function $f : [a, b] \to \mathbb{R}$ is also Lebesgue integrable
and its Lebesgue integral coincides with its Riemann integral. See Exercise 2.

Consider now the case of continuous functions $f : \mathbb{R} \to \mathbb{C}$ having compact
support. The support of $f$ is defined as the closed set


\[ \operatorname{supp} f = \overline{\{x : f(x) \ne 0\}}, \]


so that, if the support is compact, the function vanishes outside a compact interval.
The set $C_c(\mathbb{R})$ of all continuous functions $f : \mathbb{R} \to \mathbb{C}$ having compact support
is a linear subspace of $\mathcal{L}^1(\mathbb{R})$. Moreover, taking into account the additivity of the
integral, the Lebesgue integral of every function $f \in C_c(\mathbb{R})$ reduces to a Riemann
integral, more precisely,


\[ \int_{\mathbb{R}} f\,dx = \int_a^b f\,dx \]


whenever $[a, b]$ is a compact interval that includes $\operatorname{supp} f$.


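11.5.8 Theorem (The density of $C_c(\mathbb{R})$ in $\mathcal{L}^1(\mathbb{R})$) For every $f \in \mathcal{L}^1(\mathbb{R})$ and every
$\varepsilon > 0$, there exists $f_\varepsilon \in C_c(\mathbb{R})$ such that


\[ \|f - f_\varepsilon\|_{L^1} < \varepsilon. \]


Proof By Lemma 11.3.8, the space $St(\mathbb{R})$ of all step functions is dense in $\mathcal{L}^1(\mathbb{R})$.
Therefore it suffices to prove the theorem in the particular case where $f$ is the
characteristic function of a compact interval, say $[a, b]$. For $\varepsilon \in \left(0, \frac{b-a}{2}\right)$, let $f_\varepsilon$ be
the piecewise linear function defined as follows


\[ f_\varepsilon(x) = \begin{cases} 0 & \text{if } x \in (-\infty, a] \\ (x - a)/\varepsilon & \text{if } x \in [a, a + \varepsilon] \\ 1 & \text{if } x \in [a + \varepsilon, b - \varepsilon] \\ (b - x)/\varepsilon & \text{if } x \in [b - \varepsilon, b] \\ 0 & \text{if } x \in [b, \infty). \end{cases} \]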


See Fig. 11.1.


Fig. 11.1 The graph of the function $f_\varepsilon(x)$ (the points $a$, $a + \varepsilon$, $b - \varepsilon$, $b$ are marked on the horizontal axis)


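Then $f_\varepsilon \in C_c(\mathbb{R})$ and


\[ \|f - f_\varepsilon\|_{L^1} = \int_{\mathbb{R}} \left| \chi_{[a,b]} - f_\varepsilon \right| dx = \int_a^{a+\varepsilon} \left(1 - \frac{x - a}{\varepsilon}\right) dx + \int_{b-\varepsilon}^{b} \left(1 - \frac{b - x}{\varepsilon}\right) dx = \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \]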
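
The value of this $L^1$ distance can be checked numerically. The Python sketch below evaluates the piecewise linear function from the proof and approximates the integral by a midpoint Riemann sum; the function names and the number of subintervals are assumptions made for the illustration.

```python
def f_eps(x, a, b, eps):
    """Piecewise linear approximation of the indicator of [a, b]; needs eps < (b - a) / 2."""
    if x <= a or x >= b:
        return 0.0
    if x < a + eps:
        return (x - a) / eps
    if x > b - eps:
        return (b - x) / eps
    return 1.0


def l1_distance_to_indicator(a, b, eps, steps=200_000):
    """Midpoint-rule value of the integral of |chi_[a,b] - f_eps| (both vanish outside [a, b])."""
    h = (b - a) / steps
    total = 0.0
    for k in range(steps):
        x = a + (k + 0.5) * h
        total += abs(1.0 - f_eps(x, a, b, eps))
    return total * h


distance = l1_distance_to_indicator(0.0, 1.0, 0.25)
```

For $a = 0$, $b = 1$ and $\varepsilon = 0.25$ the computed distance agrees with $\varepsilon$ up to rounding, since the integrand is piecewise linear and the kinks fall on grid points.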


The argument of Theorem 11.5.8 shows that for every open subset $U$ of $\mathbb{R}$, the
space $C_c(U)$ (of all continuous functions $f : U \to \mathbb{C}$ having compact support
included in $U$), is dense in $\mathcal{L}^1(U)$.

One can prove a stronger result, the density of the space


\[ C_c^{\infty}(U) = \{f \in C_c(U) : f \text{ of class } C^{\infty}\} \]


in $\mathcal{L}^1(U)$. The proof is similar to that of Theorem 11.5.8, by considering instead of
the function $f_\varepsilon$, the function given by Lemma 11.5.9 in the case where $K = [a, b]$
and $U = (a - \varepsilon, b + \varepsilon)$.


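11.5.9 Urysohn Lemma ($C^{\infty}$ version) Let $K$ be a compact subset of $\mathbb{R}$ and let $U$
be an open subset of $\mathbb{R}$ such that $K \subset U$. Then there exists a function $f$ in $C^{\infty}(\mathbb{R}, \mathbb{R})$,
with compact support, that verifies the following properties:


\[ 0 \le f \le 1, \quad f(x) = 1 \text{ on } K \quad \text{and} \quad f(x) = 0 \text{ on } \complement U. \]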


Proof According to Exercise 2, Sect. 8.5, for every pair of real numbers $a < b$, the
function


\[ f_{a,b}(x) = \begin{cases} e^{1/((x-a)(x-b))} & \text{if } x \in (a, b) \\ 0 & \text{if } x \in \mathbb{R} \setminus (a, b) \end{cases} \]


is of class $C^{\infty}$. Notice that $\operatorname{supp} f_{a,b} = [a, b]$.

Let $\varepsilon > 0$ and $a \in \mathbb{R}$. Then the function $g : \mathbb{R} \to \mathbb{R}$ defined by the formula


\[ g(x) = f_{-1,1}\left(\frac{x - a}{\varepsilon}\right) \]


is of class $C^{\infty}$, positive on the interval $(a - \varepsilon, a + \varepsilon)$, and null on $\mathbb{R} \setminus (a - \varepsilon, a + \varepsilon)$.

Since $K$ is compact, there exists a function $G \in C^{\infty}(\mathbb{R})$ (obtained as a finite sum
of such functions $g$), that verifies the conditions $G(x) > 0$ on $K$ and $G(x) = 0$ on
the complement of a closed set included in $U$. Put $\delta = \inf\{G(x) : x \in K\}$. Then
the function


\[ h(x) = \frac{\int_0^x f_{0,\delta}(t)\,dt}{\int_0^{\delta} f_{0,\delta}(t)\,dt} \]


is of class $C^{\infty}$, takes values in $[0, 1]$, $h(x) = 0$ for $x \le 0$ and $h(x) = 1$ for $x \ge \delta$.
The proof ends by noticing that $f = h \circ G$ fulfills all requirements of Urysohn
Lemma.


A very general technique concerning the approximation of integrable functions 
by infinitely differentiable functions with compact support will be presented at the 
end of Sect. 11.7. 

Using the Beppo-Levi Theorem, one can prove easily that every function which
is absolutely integrable in the Riemann improper sense is also Lebesgue integrable
(the value of the integral being the same). In particular,


\[ \int_0^1 \frac{1}{x^p}\,dx = \frac{1}{1 - p} \ \text{ if } p < 1 \quad \text{and} \quad \int_1^{\infty} \frac{1}{x^p}\,dx = \frac{1}{p - 1} \ \text{ if } p > 1, \]


as Lebesgue integrals.
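
These two values arise as monotone limits of integrals over truncated intervals, which is where the Beppo-Levi Theorem enters. The following Python sketch evaluates the truncated integrals through the elementary antiderivative of $x^{-p}$; the function names and the sample truncation points are assumptions made for the illustration.

```python
def integral_near_zero(p, delta):
    """Integral of x**(-p) over [delta, 1] for p < 1, from the antiderivative."""
    return (1.0 - delta ** (1.0 - p)) / (1.0 - p)


def integral_to_infinity(p, upper):
    """Integral of x**(-p) over [1, upper] for p > 1, from the antiderivative."""
    return (1.0 - upper ** (1.0 - p)) / (p - 1.0)


near_zero = [integral_near_zero(0.5, 10.0 ** (-k)) for k in range(1, 13)]
to_infinity = [integral_to_infinity(2.0, 10.0 ** k) for k in range(1, 10)]
```

Both lists increase, the first towards $1/(1 - p) = 2$ for $p = 1/2$ and the second towards $1/(p - 1) = 1$ for $p = 2$.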

The Lebesgue integral is an absolute integral in the sense that if a function is 
integrable, then its absolute value is integrable too. Therefore, the functions which 
are integrable in the Riemann improper sense but not absolutely integrable escape 
the theory of the Lebesgue integral. 


Exercises 


1. Prove that the product of two measurable functions is a measurable function.
2. Let $f : [a, b] \to \mathbb{R}$ be a Riemann integrable function. Prove that $f$ is also
Lebesgue integrable and the value of its integral is the same.
[Hint: Use Lebesgue's Dominated Convergence Theorem, by taking into account
the formula

\[ \int_a^b f(x)\,dx = \lim_{n \to \infty} \frac{b - a}{n} \sum_{k=0}^{n-1} f\left(a + k\,\frac{b - a}{n}\right).] \]

3. Prove that a function $f : \mathbb{R} \to \mathbb{R}$ is measurable if and only if the inverse image
of every Borel set is a measurable set.


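4. Prove that the function

\[ f(x) = \begin{cases} 1/x & \text{if } x \in \mathbb{R} \setminus \{0\} \\ 0 & \text{if } x = 0 \end{cases} \]

is measurable, but there is no continuous function $g : \mathbb{R} \to \mathbb{R}$ such that $f = g$
almost everywhere.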


5. Let $f : \mathbb{R} \to \mathbb{R}$ be a function for which there exists a sequence $(f_n)_n$ of
measurable functions such that

\[ f(x) = \limsup_{n \to \infty} f_n(x) \text{ almost everywhere.} \]

Prove that $f$ is a measurable function.


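6. Let $(f_n)_n$ be a sequence of measurable functions on $\mathbb{R}$. Prove that the set $A$ of
all points $x$ for which there exists the limit $\lim_{n \to \infty} f_n(x)$ is measurable.
[Hint: Notice that

\[ A = \bigcap_{n=0}^{\infty} \bigcup_{m=0}^{\infty} \bigcap_{i,j \ge m} \left\{x : |f_i(x) - f_j(x)| < \frac{1}{n+1}\right\}.] \]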


7. Prove that the derivative of every differentiable function $f : \mathbb{R} \to \mathbb{C}$ is measurable.


8. Suppose that $f : \mathbb{R} \to \mathbb{R}$ is a measurable function and $g : \mathbb{R} \to \mathbb{R}$ is a continuous
function. Prove that the function $g \circ f$ is measurable. Show, by an example, that
the composition of two measurable functions may fail to be measurable.


9. (The Principle of Cavalieri). If $f \in \mathcal{L}^1(\mathbb{R})$ is a positive function, show that

\[ \int_{\mathbb{R}} f(x)\,dx = \int_0^{\infty} \lambda(\{x : f(x) > t\})\,dt. \]

[Hint: Approximate the left hand side integral by using the sequence $(f_n)_n$ that
appears in the proof of Theorem 11.5.4.]
10. (Approximation of measurable sets by Borel sets)

(a) Show that the characteristic function of any compact interval is the pointwise 
limit of a sequence of continuous functions with compact supports. 

(b) Infer that the characteristic function of every measurable set A is the almost 
everywhere limit of a sequence of continuous functions f,, with compact 
supports. 

(c) Let $A$ and $(f_n)_n$ be as above and denote by $X$ the Lebesgue null set on which
the sequence $(f_n)_n$ is not convergent to $\chi_A$. Prove that the function

_ | xaA(x) ifx e R\X 
FO 1 oa. ifxe X 
is lower semicontinuous on R. 
(d) Infer from the lower semicontinuity of the function $f$ that for every measurable
set $A$, there exist a sequence $(U_n)_n$ of open sets and a Lebesgue null set
$Y$ such that $A \cup Y = \bigcap_{n=0}^{\infty} U_n$.

(e) Conclude that for every measurable set $A$, there exist a sequence $(C_n)_n$ of
closed sets and a Lebesgue null set $Z$ such that $A = \left(\bigcup_{n=0}^{\infty} C_n\right) \cup Z$.

11. (The averaging property of integral). Let $f : \mathbb{R} \to \mathbb{R}$ be an integrable function
and let $C$ be a closed subset such that for all measurable subsets $A$, with $0 <
\lambda(A) < \infty$, the averages $\frac{1}{\lambda(A)} \int_A f\,dx$ belong to $C$. Prove that $f(x) \in C$ for
almost all $x$.
12. (Jensen's Inequality). Suppose that $g$ is a real-valued integrable function defined
on a measurable set $A$ with $0 < \lambda(A) < \infty$, and $f$ is a convex function defined
on an interval containing the range of $g$. Prove the inequality


\[ f\left(\frac{1}{\lambda(A)} \int_A g(x)\,dx\right) \le \frac{1}{\lambda(A)} \int_A f(g(x))\,dx. \]
A A 


13. (M. Kac). Let $g : \mathbb{R} \to S^1$ be a measurable function that verifies the functional
equation

\[ g(x + y) = g(x)g(y) \text{ for all } x, y \in \mathbb{R}. \]

Prove that there is a real number $\alpha$ such that $g(x) = e^{i\alpha x}$ for all $x \in \mathbb{R}$.
[Hint: Choose $a, b \in \mathbb{R}$ such that $\int_a^b g(t)\,dt \ne 0$. Integrating the functional
equation with respect to $y$, we get

\[ g(x) = \frac{\int_{a+x}^{b+x} g(t)\,dt}{\int_a^b g(t)\,dt}. \]

By the dependence on parameter theorems it follows that $g$ is continuous (and
then that it is differentiable). Then differentiate the functional equation with
respect to $y$ and conclude that $g$ verifies the equation $g'(x) = g'(0)g(x)$.]
14. (Kac's solution to Cauchy functional equation). Prove that the only measurable
functions $f : \mathbb{R} \to \mathbb{R}$ satisfying Cauchy's functional equation are those of the
form $f(x) = ax$.


11.6 The Lebesgue Measure


The Lebesgue measure is the function $\lambda : \mathfrak{M}(\mathbb{R}) \to [0, \infty]$ defined by


\[ \lambda(A) = \begin{cases} \int_{\mathbb{R}} \chi_A\,dx & \text{if } \chi_A \text{ is integrable} \\ \infty & \text{otherwise.} \end{cases} \]


Obviously, the Lebesgue measure extends the interval length. It has a number of 
striking properties, briefly recalled here. 

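First, the Lebesgue measure is $\sigma$-additive: if $(A_n)_n$ is a sequence of mutually
disjoint measurable sets, then


\[ \lambda\left(\bigcup_{n=0}^{\infty} A_n\right) = \sum_{n=0}^{\infty} \lambda(A_n). \]


See Exercise 2 for details.
Next, $\lambda$ is monotone,


\[ A \subset B \text{ implies } \lambda(A) \le \lambda(B), \]

and subtractive,

\[ A \subset B \text{ implies } \lambda(B \setminus A) = \lambda(B) - \lambda(A), \]

provided that $\lambda(A) < \infty$.

The above properties of $\sigma$-additivity and monotonicity imply that $\lambda$ is also
$\sigma$-subadditive,

\[ \lambda\left(\bigcup_{n=0}^{\infty} A_n\right) \le \sum_{n=0}^{\infty} \lambda(A_n) \]

whenever $(A_n)_n$ is a sequence of measurable sets.
The Lebesgue measure is translation invariant in the sense that


\[ \lambda(A) = \lambda(A + x) \]


for every measurable set $A$ and every real number $x$; here $A + x = \{a + x : a \in A\}$.
The proof is a direct consequence of Proposition 11.3.6.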
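
For finite unions of bounded intervals the properties listed above can be checked directly. The Python sketch below computes the measure of such a union by merging overlapping intervals; the representation of a set as a list of `(left, right)` pairs and the names `measure` and `translate` are assumptions made for the illustration.

```python
def measure(intervals):
    """Total length of a finite union of bounded intervals given as (left, right) pairs."""
    total = 0.0
    current_left = current_right = None
    for left, right in sorted(intervals):
        if right <= left:
            continue  # empty or degenerate interval, length 0
        if current_right is None or left > current_right:
            if current_right is not None:
                total += current_right - current_left
            current_left, current_right = left, right
        else:
            current_right = max(current_right, right)  # overlapping pieces are merged
    if current_right is not None:
        total += current_right - current_left
    return total


def translate(intervals, x):
    """Return the translated set A + x."""
    return [(left + x, right + x) for left, right in intervals]


A = [(0.0, 1.0), (0.5, 2.0)]   # overlapping pieces, the union is [0, 2]
B = [(3.0, 4.0)]               # disjoint from A
```

Here `measure(A)` is 2 although the lengths of the two pieces add up to 2.5 (subadditivity), `measure(A + B)` is 3 (additivity on disjoint sets), and `measure(translate(A, 10.0))` is again 2 (translation invariance).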
Three other important properties of the Lebesgue measure are presented below.


11.6.1 Theorem (Property of continuity)


(a) If $(A_n)_n$ is a decreasing sequence of measurable sets of finite measure such that
$\bigcap_{n=0}^{\infty} A_n = \emptyset$, then $\lim_{n \to \infty} \lambda(A_n) = 0$.

(b) If $(A_n)_n$ is an increasing sequence of measurable sets of finite measure, then

\[ \lim_{n \to \infty} \lambda(A_n) = \lambda\left(\bigcup_{n=0}^{\infty} A_n\right). \]


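Proof Both assertions are consequences of the properties of $\sigma$-additivity and
subtractivity. In case (a),


\[ \lambda(A_0) = \sum_{n=0}^{\infty} \lambda(A_n \setminus A_{n+1}) = \lim_{N \to \infty} \sum_{n=0}^{N} \lambda(A_n \setminus A_{n+1}) = \lim_{N \to \infty} \left[\lambda(A_0) - \lambda(A_{N+1})\right], \]


whence $\lim_{n \to \infty} \lambda(A_n) = 0$. In case (b),


\[ \lambda\left(\bigcup_{n=0}^{\infty} A_n\right) = \lambda(A_0) + \sum_{n=0}^{\infty} \lambda(A_{n+1} \setminus A_n) = \lambda(A_0) + \lim_{N \to \infty} \sum_{n=0}^{N} \left[\lambda(A_{n+1}) - \lambda(A_n)\right] = \lim_{N \to \infty} \lambda(A_{N+1}). \]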


11.6.2 Theorem (Property of regularity) For every measurable set $A$,


\[ \lambda(A) = \sup\{\lambda(K) : K \text{ compact}, K \subset A\} = \inf\{\lambda(D) : D \text{ open}, D \supset A\}. \]


Proof According to Remark 11.5.5, the characteristic function of $A$ is the pointwise
limit of an increasing sequence $(f_n)_n$ of positive step functions. The sets


\[ A_n = \{x : f_n(x) > 1/2\} \]

are finite unions of bounded intervals included in $A$ and the sequence $(\chi_{A_n})_n$ is
increasing and pointwise convergent to $\chi_A$.
If $\lambda(A) < \infty$, from Beppo-Levi's Theorem we infer that

\[ \lambda(A) = \sup\{\lambda(A_n) : n \in \mathbb{N}\}, \]

whence

\[ \lambda(A) = \sup\{\lambda(K) : K \text{ compact}, K \subset A\}. \]

In the general case, we use the property of continuity of Lebesgue measure, which
shows that


\[ \lambda(A) = \sup\{\lambda(A \cap [-n, n]) : n \ge 1\}. \]


The second equality in the statement follows from the first one, applied
to $\complement A$.


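11.6.3 Theorem (Property of absolute continuity) Let $f \in \mathcal{L}^1(\mathbb{R})$. Then for every
$\varepsilon > 0$, there is $\delta > 0$ such that


\[ A \in \mathfrak{M}(\mathbb{R}) \text{ and } \lambda(A) < \delta \text{ implies } \int_A |f|\,dx < \varepsilon. \]


Proof Let $f_n = \min\{|f|, n\}$. According to the Dominated Convergence Theorem,
it follows that $f_n \to |f|$ in $\mathcal{L}^1(\mathbb{R})$. Therefore, for $\varepsilon > 0$ arbitrarily fixed, there is a
natural number $N$ such that $\| |f| - f_n \|_{L^1} < \varepsilon/2$ for all $n \ge N$. Let $\delta = \varepsilon/2N$. If
$A \in \mathfrak{M}(\mathbb{R})$, with $\lambda(A) < \delta$, then


\[ \int_A |f|\,dx = \int_A (|f| - f_N)\,dx + \int_A f_N\,dx < \frac{\varepsilon}{2} + \lambda(A) \cdot N < \varepsilon. \]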


The last theorem allows us to associate to each positive function $f \in \mathcal{L}^1(\mathbb{R})$,
whose integral equals 1, a probability measure defined by the formula


\[ \mu_f(A) = \int_A f\,dx, \qquad A \in \mathfrak{M}(\mathbb{R}); \]


in this context, $f$ is called the density of probability. The $\sigma$-additivity of $\mu_f$ makes
the objective of Exercise 5.

The best known example in Probability Theory is the density of the normal distribution
of mean $m$ and standard deviation $\sigma$:


\[ f(x; m, \sigma) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x - m)^2/(2\sigma^2)}. \]

See Appendix C, Sect. 1.2.
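
As a numerical illustration, the following Python sketch approximates $\mu_f(A)$ for the normal density and an interval $A$ by a midpoint Riemann sum. The helper names, the parameter values $m = 1$, $\sigma = 2$ and the truncation of the real line to $[m - 10\sigma, m + 10\sigma]$ are assumptions made for the example.

```python
import math


def normal_density(x, m, sigma):
    """Density of the normal distribution of mean m and standard deviation sigma."""
    return math.exp(-((x - m) ** 2) / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))


def midpoint_integral(func, left, right, steps=100_000):
    """Midpoint Riemann sum of func over [left, right]."""
    h = (right - left) / steps
    return h * sum(func(left + (k + 0.5) * h) for k in range(steps))


m, sigma = 1.0, 2.0
total_mass = midpoint_integral(lambda x: normal_density(x, m, sigma), m - 10 * sigma, m + 10 * sigma)
one_sigma_mass = midpoint_integral(lambda x: normal_density(x, m, sigma), m - sigma, m + sigma)
```

The first value approximates the total mass 1, and the second one the probability of the interval $[m - \sigma, m + \sigma]$, which can be compared with `math.erf(1 / math.sqrt(2))`.
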
Exercises 


1. Show that a Lebesgue null set is exactly a measurable set which has measure 0.
2. (a) Prove the property of $\sigma$-additivity of Lebesgue measure,


\[ \lambda\left(\bigcup_{n=0}^{\infty} A_n\right) = \sum_{n=0}^{\infty} \lambda(A_n) \]

for every sequence $(A_n)_n$ of mutually disjoint measurable sets.

(b) Show that the property of $\sigma$-additivity still works for sequences of almost
disjoint measurable sets (that is, such that $A_m \cap A_n$ is a Lebesgue null set
whenever $m \ne n$).

[Hint: The nontrivial case is when $\lambda(A_n) < \infty$ for every $n$. Apply Beppo-Levi's
Theorem to the sequence of functions $\chi_{\bigcup_{k=0}^{n} A_k}$ (which is pointwise convergent
to $\chi_{\bigcup_{k=0}^{\infty} A_k}$), by taking into account that


\[ \chi_{\bigcup_{k=0}^{n} A_k} = \sum_{k=0}^{n} \chi_{A_k}.] \]


3. Prove the property of regularity of Lebesgue measure. 

4. (Kolmogorov's criterion of $\sigma$-additivity). Let $\Sigma$ be a $\sigma$-algebra of subsets of
a nonempty set $\Omega$ and let $\mu : \Sigma \to [0, \infty)$ be an additive measure such that
$\lim_{n \to \infty} \mu(A_n) = 0$ whenever $(A_n)_n$ is a decreasing sequence of sets of $\Sigma$ with
empty intersection. Prove that $\mu$ is $\sigma$-additive.
5. Infer from Kolmogorov's criterion of $\sigma$-additivity and the property of absolute
continuity of Lebesgue measure that every measure of the form


\[ \mu_f(A) = \int_A f\,dx, \qquad A \in \mathfrak{M}(\mathbb{R}), \]


associated to a positive function $f \in \mathcal{L}^1(\mathbb{R})$, is $\sigma$-additive.
6. Let $A$ be a measurable subset of $\mathbb{R}$ and $f \in \mathcal{L}^1(A)$. Prove that


\[ \int_A |f(x)|\,dx \le 4 \cdot \sup\left\{ \left| \int_B f(x)\,dx \right| : B \in \mathfrak{M}(\mathbb{R}),\ B \subset A \right\}. \]


7. (Markov's inequality for Lebesgue measure). Suppose that $f \in \mathcal{L}^1(\mathbb{R})$ and $A$
is a measurable subset of $\mathbb{R}$. Prove that for every $\varepsilon > 0$, we have


\[ \lambda(\{x \in A : |f(x)| \ge \varepsilon\}) \le \frac{1}{\varepsilon} \int_A |f(x)|\,dx. \]


8. (A discrepancy between series and integrals). (a) It is known that if a series $\sum_n a_n$
is convergent, then $a_n \to 0$. If $f \in \mathcal{L}^1((0, \infty))$, is it true that $\lim_{x \to \infty} f(x) = 0$?
Give an example.
(b) Suppose that $f \in \mathcal{L}^1((0, \infty))$. Prove that $\lim_{x \to \infty} f(x) = 0$ in density, that is,
for every $\varepsilon > 0$, we have

\[ \lim_{r \to \infty} \frac{\lambda(\{x \in [0, r] : |f(x)| \ge \varepsilon\})}{r} = 0. \]


9. Prove that $\mathfrak{M}(\mathbb{R})$ and $\mathcal{P}(\mathbb{R})$ are cardinal equivalent.
[Hint: Use the Cantor–Schröder–Bernstein Theorem and the fact that Cantor's
triadic set has measure 0 and is cardinal equivalent to $\mathbb{R}$.]
10. Prove that there is no function $m : \mathcal{P}(\mathbb{R}) \to [0, \infty]$ such that $m([0, 1]) = 1$, $m$
is countably additive and $m$ is translation invariant.


11.7 Dependence on Parameters 


The following result gives us a rather general condition under which the limit and 
the integral can be used interchangeably. 


11.7.1 Theorem (Continuous dependence on parameters) Let $A$ be a measurable
subset of $\mathbb{R}$, $\Omega$ a metric space and $f : A \times \Omega \to \mathbb{C}$, $f = f(x, y)$, a function that
verifies the following three conditions:


(a) for each $y \in \Omega$, the function $x \mapsto f(x, y)$ is integrable on $A$;

(b) the function $y \mapsto f(x, y)$ is continuous on $\Omega$ for almost every $x \in A$;

(c) there is a function $h \in \mathcal{L}^1(A)$ such that $|f(x, y)| \le h(x)$ for every $y \in \Omega$, and
almost every $x \in A$.


Then the function


\[ F(y) = \int_A f(x, y)\,dx \]


is continuous on $\Omega$.


Proof Let $(y_n)_n$ be a sequence converging to $y$ in $\Omega$. To it, we attach the functions
$g_n(x) = f(x, y_n)$ and $g(x) = f(x, y)$. According to our hypotheses, $g_n \to g$
a.e. and $|g_n(x)| \le h(x)$ almost everywhere. By the Dominated Convergence Theorem,
it follows that $\int_A g_n\,dx \to \int_A g\,dx$, that is, $F(y_n) \to F(y)$, and the proof is
done.


We next discuss the problem of interchanging integral and derivative. 


11.7.2 Theorem (Differentiating under the integral sign) Let $A$ and $B$ be two
nondegenerate intervals and $f : A \times B \to \mathbb{C}$, $f = f(x, t)$, a function such that:


(a) for every $x \in A$, the function $f(x, \cdot)$ is Lebesgue integrable on $B$;

(b) for every $t \in B$, the function $f(\cdot, t)$ is $C^1$ on $A$;

(c) there exists a function $g \in \mathcal{L}^1(B)$ such that $|f'_x(x, t)| \le g(t)$ for every $x \in A$
and every $t \in B$.

Then,


\[ F(x) = \int_B f(x, t)\,dt \]


is a $C^1$ function on $A$ and


\[ F'(x) = \int_B f'_x(x, t)\,dt \quad \text{for every } x \in A. \]


Here $f'_x(x, t)$ denotes the derivative of $f(x, t)$ with respect to $x$.


Proof Due to the linearity of both the integral and the derivative, it suffices to consider
the case where $f$ is real-valued.

Let $(x_n)_n$ be a sequence that is convergent to $x$ in $A$. By Lagrange's Mean Value
Theorem, the functions $t \mapsto \dfrac{f(x_n, t) - f(x, t)}{x_n - x}$ verify the condition

\[ \left| \frac{f(x_n, t) - f(x, t)}{x_n - x} \right| \le \sup_{u \in A} |f'_x(u, t)| \le g(t) \quad \text{for every } t \in B \]

and thus, they belong to $\mathcal{L}^1(B)$ (according to Lebesgue's Criterion of Integrability
11.5.2). Since


\[ \lim_{n \to \infty} \frac{f(x_n, t) - f(x, t)}{x_n - x} = f'_x(x, t), \]


we can apply the Dominated Convergence Theorem in order to get the integrability
of the function $t \mapsto f'_x(x, t)$ and the fact that


\[ \lim_{n \to \infty} \frac{F(x_n) - F(x)}{x_n - x} = \int_B f'_x(x, t)\,dt. \]


Since $(x_n)_n$ was an arbitrarily fixed convergent sequence, the above reasoning shows
that the function $F(x)$ is differentiable and its derivative is given by the formula in
the statement.

The continuity of F’(x) follows from Theorem 11.7.1. 
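
The formula of Theorem 11.7.2 can be tested on a concrete integrand. The Python sketch below takes $f(x, t) = e^{-xt^2}$ on $A = [0, 2]$, $B = [0, 1]$ (an example chosen for the illustration; here $|f'_x(x, t)| \le t^2$, so condition (c) holds with $g(t) = t^2$) and compares the integral of $f'_x$ with a central difference quotient of $F$. All function names are assumptions.

```python
import math


def midpoint_integral(func, left, right, steps=20_000):
    """Midpoint Riemann sum of func over [left, right]."""
    h = (right - left) / steps
    return h * sum(func(left + (k + 0.5) * h) for k in range(steps))


def F(x):
    """F(x) = integral over [0, 1] of exp(-x * t^2) dt."""
    return midpoint_integral(lambda t: math.exp(-x * t * t), 0.0, 1.0)


def F_prime_by_formula(x):
    """Integral over [0, 1] of the x-derivative -t^2 * exp(-x * t^2)."""
    return midpoint_integral(lambda t: -t * t * math.exp(-x * t * t), 0.0, 1.0)


def F_prime_by_difference(x, h=1e-4):
    """Central difference quotient of F at x."""
    return (F(x + h) - F(x - h)) / (2.0 * h)


by_formula = F_prime_by_formula(1.0)
by_difference = F_prime_by_difference(1.0)
```

The two numbers agree up to the discretization error of the difference quotient, as the theorem predicts.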

The next result is concerned with the permutation of integrals. In order to avoid 
some technicalities, we consider here the particular case of continuous functions. 


11.7.3. Fubini-Tonelli Theorem (for noncompact intervals) Let $A$ and $B$ be two intervals
and $f : A \times B \to \mathbb{C}$, $f = f(x, t)$, a continuous function such that at least one
of the following two iterated integrals

\[ \int_A \left( \int_B |f(x, t)|\,dt \right) dx \quad \text{and} \quad \int_B \left( \int_A |f(x, t)|\,dx \right) dt \tag{11.7} \]

exists in the sense of Lebesgue integrability. Then, both iterated integrals


\[ \int_A \left( \int_B f(x, t)\,dt \right) dx \quad \text{and} \quad \int_B \left( \int_A f(x, t)\,dx \right) dt \tag{11.8} \]


exist and they are equal to each other.


Here the existence of an iterated integral like $\int_A \left( \int_B |f(x, t)|\,dt \right) dx$ means that
the functions $t \mapsto |f(x, t)|$ are integrable on $B$ for every $x \in A$, and the resulting
integrals provide a function $x \mapsto \int_B |f(x, t)|\,dt$ integrable on $A$.


Proof The case where both intervals are compact made the object of Theorem 9.4.4.
We will detail here the case where $A = B = [0, \infty)$ and the first iterated integral,


\[ \int_0^{\infty} \left( \int_0^{\infty} |f(x, t)|\,dt \right) dx, \tag{11.9} \]


exists in the Lebesgue sense. All other cases can be treated in the same manner. The
existence of the integral (11.9) assures the existence of the corresponding integral for
$f$ replaced by each of the functions $(\operatorname{Re} f)^+$, $(\operatorname{Re} f)^-$, $(\operatorname{Im} f)^+$ and $(\operatorname{Im} f)^-$.

Taking into account the linearity of the integral and the fact that every function
$h \in \mathcal{L}^1(\mathbb{R}^n)$ can be written as a linear combination of positive functions,


\[ h = \operatorname{Re} h + i \operatorname{Im} h = (\operatorname{Re} h)^+ - (\operatorname{Re} h)^- + i (\operatorname{Im} h)^+ - i (\operatorname{Im} h)^-, \]


we can reduce the equality of the integrals (11.8) to the case where $f$ is positive.
According to Theorem 9.4.4 and Beppo-Levi's Theorem, for every number $a > 0$,


\[ \int_0^{\infty} \left( \int_0^{\infty} f(x, t)\,dt \right) dx \ge \int_0^{a} \left( \int_0^{\infty} f(x, t)\,dt \right) dx = \lim_{\beta \to \infty} \int_0^{a} \left( \int_0^{\beta} f(x, t)\,dt \right) dx \]

\[ = \lim_{\beta \to \infty} \int_0^{\beta} \left( \int_0^{a} f(x, t)\,dx \right) dt = \int_0^{\infty} \left( \int_0^{a} f(x, t)\,dx \right) dt. \]

A new application of Beppo-Levi's Theorem yields the existence of the second iterated
integral and also the fact that


\[ \int_0^{\infty} \left( \int_0^{\infty} f(x, t)\,dt \right) dx \ge \int_0^{\infty} \left( \int_0^{\infty} f(x, t)\,dx \right) dt. \]


Now we can interchange the roles of the two integrals, which gives their equality.


The above theorems on the dependence of Lebesgue integral on parameters have 
important applications in mathematics, probability and statistics, physics, mechanics, 
etc. 

We end this section by discussing an important technique of approximation of 
integrable functions by smooth functions. The key notion is that of Dirac sequence. 


11.7.4 Definition A Dirac sequence is any sequence of functions $\omega_n$ which verifies
the following three properties:


(DIR1) The functions $\omega_n$ are continuous, with compact support and positive.

(DIR2) $\int_{\mathbb{R}} \omega_n\,dx = 1$ for every $n \ge 1$.

(DIR3) For all $\varepsilon > 0$ and $\delta > 0$, there exists a natural number $N$ such that for
all $n \ge N$, we have


\[ \int_{-\infty}^{-\delta} \omega_n(x)\,dx + \int_{\delta}^{\infty} \omega_n(x)\,dx < \varepsilon. \]


The meaning of condition (DIR3) is that the functions $\omega_n$ concentrate around the $Oy$
axis if $n$ is sufficiently large.
The simplest way to produce a Dirac sequence is to start with a positive function
$\omega \in C_c(\mathbb{R}, \mathbb{R})$ such that

\[ \int_{\mathbb{R}} \omega(x)\,dx = 1. \]

Then it is easily seen that the sequence

\[ \omega_n(x) = n\omega(nx), \qquad n \in \mathbb{N}, \]

is a Dirac one. An example of such a function is the so called test function,


\[ \omega(x) = \begin{cases} c\,e^{-1/(1 - x^2)} & \text{if } |x| < 1 \\ 0 & \text{if } |x| \ge 1, \end{cases} \tag{11.10} \]


where the constant $c$ assures that the integral of $\omega$ is 1. This is an even function of
class $C^{\infty}$. See Sect. 8.5, Exercise 2.
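
The properties (DIR2) and (DIR3) of the sequence $\omega_n(x) = n\omega(nx)$ can be observed numerically. In the Python sketch below the constant $c$ of (11.10) is itself obtained from a midpoint Riemann sum, so it is only an approximation; the function names are assumptions made for the illustration.

```python
import math


def bump(x):
    """Unnormalized test function exp(-1 / (1 - x^2)) on (-1, 1), zero elsewhere."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0


def midpoint_integral(func, left, right, steps=20_000):
    """Midpoint Riemann sum of func over [left, right]."""
    h = (right - left) / steps
    return h * sum(func(left + (k + 0.5) * h) for k in range(steps))


C = 1.0 / midpoint_integral(bump, -1.0, 1.0)  # numerical value of the constant c


def omega(x):
    return C * bump(x)


def omega_n(n, x):
    return n * omega(n * x)


def mass_outside(n, delta):
    """Approximate value of the integral of omega_n over the set where |x| > delta."""
    return 1.0 - midpoint_integral(lambda x: omega_n(n, x), -delta, delta)
```

Since $\omega_n$ vanishes outside $[-1/n, 1/n]$, `mass_outside(n, delta)` is numerically zero as soon as $n \ge 1/\delta$, which is the content of (DIR3) for this sequence.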


The smoothing process via Dirac sequences is realized by an operation called
convolution product, formally defined by the formula


\[ (f * g)(x) = \int_{\mathbb{R}} f(x - y)\,g(y)\,dy. \tag{11.11} \]


The fact that the convolution product is well defined will be assured by the
particular framework in which it will be used: $f$ an integrable function and $g$ a
continuous function, with compact support. See Lebesgue's Criterion of Integrability
11.5.2.


11.7.5 Remark It is worth noticing that the convolution product makes sense for all
pairs of integrable functions $f, g \in \mathcal{L}^1(\mathbb{R})$ and defines a function that also belongs
to $\mathcal{L}^1(\mathbb{R})$. Moreover,


\[ \|f * g\|_{L^1} \le \|f\|_{L^1} \cdot \|g\|_{L^1}. \]


The proof is based on a more general version of Fubini's Theorem, concerning
integrable functions on $\mathbb{R}^2$. See Rudin [1], Theorem 8.14.


The product of convolution is commutative, due to the invariance of Lebesgue
integral under translation:


\[ f * g = g * f. \]

See Proposition 11.3.6.


11.7.6 Theorem If $f \in \mathcal{L}^1(\mathbb{R})$ and $(\omega_n)_n$ is the Dirac sequence associated to the
test function (11.10), then:


(a) all functions $\omega_n * f$ are infinitely differentiable;
(b) if $f$ vanishes outside a compact set $K$, then


\[ \operatorname{supp}\, \omega_n * f \subset \left\{x : d(x, K) \le \frac{1}{n}\right\}; \]


(c) $\omega_n * f \to f$ uniformly on every compact set $K$ on which $f$ is continuous. In
particular, if $f$ is continuous, then $\omega_n * f \to f$ pointwise on $\mathbb{R}$.


Proof The assertion (a) follows from Theorem 11.7.2. For the assertion (b), notice
that


\[ (\omega_n * f)(x) = \int_{\mathbb{R}} f(x - y)\,\omega_n(y)\,dy = n \int_{\mathbb{R}} f(x - y)\,\omega(ny)\,dy = n \int_K \omega(n(x - z))\,f(z)\,dz. \]


The last integral is 0 whenever $n|x - z| \ge 1$ for all $z \in K$, in particular, for every $x$ such that
$d(x, K) \ge 1/n$. Therefore the support of $\omega_n * f$ is included in the compact set
$\{x : d(x, K) \le \frac{1}{n}\}$.

(c) By condition (DIR2), and the change of variable $z = ny$,


\[ (\omega_n * f)(x) - f(x) = \int_{\mathbb{R}} f(x - y)\,\omega_n(y)\,dy - f(x) \int_{\mathbb{R}} \omega_n(y)\,dy = \int_{\mathbb{R}} [f(x - y) - f(x)]\,\omega_n(y)\,dy \]

\[ = n \int_{\mathbb{R}} [f(x - y) - f(x)]\,\omega(ny)\,dy = \int_{-1}^{1} [f(x - z/n) - f(x)]\,\omega(z)\,dz \]


for all $x \in \mathbb{R}$.
Since $f$ is continuous on $K$ and $K$ is a compact set, then for every $\varepsilon > 0$, there
exists $\delta > 0$ such that


\[ |y| < \delta \text{ implies } |f(x - y) - f(x)| < \varepsilon \text{ for all } x \in K; \]


this can be argued by reductio ad absurdum.
Then for all $x \in K$ and $n \ge 1 + [1/\delta]$, we have


\[ |(\omega_n * f)(x) - f(x)| \le \int_{-1}^{1} |f(x - z/n) - f(x)|\,\omega(z)\,dz \le \varepsilon \int_{-1}^{1} \omega(z)\,dz = \varepsilon. \]
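
The uniform convergence in part (c) can be observed on a simple example. The Python sketch below smooths the tent function $f(x) = \max\{0, 1 - |x|\}$ through the formula $(\omega_n * f)(x) = \int_{-1}^{1} f(x - z/n)\,\omega(z)\,dz$, with the integral replaced by a weighted sum over midpoints of $[-1, 1]$. The choice of $f$, the grid on $[-2, 2]$ and all names are assumptions made for the illustration.

```python
import math


def bump(z):
    """Unnormalized test function exp(-1 / (1 - z^2)) on (-1, 1), zero elsewhere."""
    return math.exp(-1.0 / (1.0 - z * z)) if abs(z) < 1.0 else 0.0


STEPS = 2000
H = 2.0 / STEPS
NODES = [-1.0 + (k + 0.5) * H for k in range(STEPS)]
RAW = [bump(z) for z in NODES]
TOTAL = sum(RAW)
WEIGHTS = [r / TOTAL for r in RAW]  # discrete version of omega(z) dz, total mass 1


def smooth(f, n, x):
    """Approximate (omega_n * f)(x) as the weighted sum of f(x - z/n) over the nodes z."""
    return sum(w * f(x - z / n) for w, z in zip(WEIGHTS, NODES))


def tent(x):
    return max(0.0, 1.0 - abs(x))


def max_error(n, points=81):
    """Largest value of |omega_n * f - f| on a uniform grid of [-2, 2]."""
    grid = [-2.0 + 4.0 * k / (points - 1) for k in range(points)]
    return max(abs(smooth(tent, n, x) - tent(x)) for x in grid)
```

Because the tent function satisfies $|f(x - y) - f(x)| \le |y|$, the estimate in the proof gives an error of at most $1/n$, and the computed values of `max_error(n)` decrease accordingly as $n$ grows.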


Theorem 11.7.6 still works for an arbitrary Dirac sequence, but in this case the
conclusion (a) asserts only the continuity of the functions $\omega_n * f$. If we apply this remark
to the Landau sequence,


\[ L_n(x) = \begin{cases} c_n(1 - x^2)^n & \text{if } x \in [-1, 1] \\ 0 & \text{if } |x| > 1, \end{cases} \]


(where the constants $c_n$ are chosen such that $\int_{\mathbb{R}} L_n(x)\,dx = 1$), we obtain the
density of polynomial functions in the space $C([-1, 1], \mathbb{R})$. See Exercise 8.


Exercises


1. Compute $F(t) = \int_0^{\infty} \cos(tx)\,e^{-x^2/2}\,dx$ by noticing that $F'(t) = -tF(t)$.
2. (L. Euler). Prove that


\[ \int_0^1 \frac{\log(x^2 - 2x\cos a + 1)}{x}\,dx = -\frac{(\pi - a)^2}{2} + \frac{\pi^2}{6} \]

for all $a \in [0, 2\pi]$.
[Hint: Let $f(a)$ be the value of the integral in the left hand side. This gives us a
$C^2$ function on $[0, 2\pi]$ that satisfies the functional equation $f''(a/2) + f''(\pi -
a/2) = 2f''(a)$. Conclude that $f''$ is constant.]
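3. (Fresnel's Integrals). Consider the function $f(x, y) = e^{-x^2 y} \cdot \sin y$, for $(x, y) \in
[0, \infty) \times [0, \infty)$. Prove that:
(a) $\int_0^{\infty} \left( \int_0^{\infty} f(x, y)\,dy \right) dx = \int_0^{\infty} \frac{dx}{1 + x^4} = \frac{\pi}{2\sqrt{2}}$;
(b) $\int_0^{\infty} \left( \int_0^{\infty} f(x, y)\,dx \right) dy = \sqrt{\pi} \int_0^{\infty} \sin t^2\,dt$;
(c) $\int_0^{\infty} \sin t^2\,dt = \int_0^{\infty} \cos t^2\,dt = \frac{1}{2}\sqrt{\frac{\pi}{2}}$.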
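4. Given $f \in C(\mathbb{R})$, we attach to it the sequence of functions


\[ F_0(x) = \int_0^x f(t)\,dt, \qquad F_n(x) = \int_0^x \frac{(x - t)^n}{n!}\,f(t)\,dt \quad \text{for } n \ge 1. \]


(a) Prove that $F_n' = F_{n-1}$, $F_n^{(k)}(0) = 0$ if $k \le n$ and $F_n^{(n+1)} = f$.
(b) Infer the formula


\[ \int_0^x \left( \int_0^{x_1} \left( \cdots \left( \int_0^{x_n} f(x_{n+1})\,dx_{n+1} \right) \cdots \right) dx_2 \right) dx_1 = \frac{1}{n!} \int_0^x (x - t)^n f(t)\,dt, \]


and conclude that $F_n$ is the unique solution of the following differential problem
with initial conditions


\[ y^{(n+1)} = f, \qquad y^{(k)}(0) = 0 \ \text{ for } k = 0, 1, \dots, n. \]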
5. (a) Prove that the function

$$F(x) = \int_0^\infty e^{-tx}\, \frac{\sin t}{t}\,dt$$

is continuous on $(0, \infty)$ and $\lim_{x \to \infty} F(x) = 0$.
(b) Show that it is possible to differentiate under the integral sign for $x > 0$ and infer that $F(x) = -\arctan x + C$ (for some constant $C$).

(c) Taking the limit of $F(x)$ as $x \to \infty$, conclude that $C = \pi/2$.


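Note Taking into account the convergence of the improper integral $\int_0^\infty \frac{\sin t}{t}\,dt$, we can extend $F(x)$ at $x = 0$ by putting

$$F(0) = \int_0^\infty \frac{\sin t}{t}\,dt.$$

By the convergence of this integral,

$$\int_m^n \frac{\sin t}{t}\,dt \to 0$$

as $m, n \to \infty$. The functions

$$F_n(x) = \int_0^n e^{-tx}\, \frac{\sin t}{t}\,dt, \quad x \in [0, \infty),\ n \in \mathbb{N}^*,$$

are continuous on $[0, \infty)$ and $F_n \to F$ pointwise. The convergence is uniform since, for $m < n$,

$$|F_n(x) - F_m(x)| = \left| \int_m^n e^{-tx}\, \frac{\sin t}{t}\,dt \right| \le \sup_{m \le \xi \le n} \left| \int_m^\xi \frac{\sin t}{t}\,dt \right|$$

by the Second Mean Value Theorem for integrals (the factor $e^{-tx}$ is nonnegative, decreasing in $t$ and bounded by 1). Therefore, the function $F$ is continuous at 0 and so, we conclude that

$$\int_0^\infty \frac{\sin t}{t}\,dt = \lim_{x \to 0+} \int_0^\infty e^{-tx}\, \frac{\sin t}{t}\,dt = \lim_{x \to 0+} \left( -\arctan x + \frac{\pi}{2} \right) = \frac{\pi}{2}.$$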
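The identity $F(x) = \pi/2 - \arctan x$ from Exercise 5 can be checked numerically. The following is only a sketch: the helper names `simpson` and `F`, the truncation point `T` and the number of subintervals are illustrative choices, not part of the text.

```python
import math

def simpson(g, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def F(x, T=60.0):
    """Approximate F(x) = int_0^inf e^{-t x} sin(t)/t dt, truncated at t = T (needs x > 0)."""
    def integrand(t):
        # sin(t)/t is extended by continuity with the value 1 at t = 0
        return math.exp(-t * x) * (math.sin(t) / t if t != 0.0 else 1.0)
    return simpson(integrand, 0.0, T)

for x in (0.5, 1.0, 2.0):
    # the two printed columns should agree to many digits
    print(x, F(x), math.pi / 2 - math.atan(x))
```

The truncation is harmless here because of the factor $e^{-tx}$; at $x = 0$ the integral converges only conditionally, which is why the Note argues through uniform convergence instead.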


6. (L. Euler). Using the Maclaurin expansion for the sine function and the fact that $\int_0^1 (\log x)^n\,dx = n!$ for even $n$, prove that

$$\int_0^1 \frac{\sin(\log x)}{\log x}\,dx = \frac{\pi}{4}.$$

7. (Cauchy-Frullani Integral [2]). Let $f : (0, \infty) \to \mathbb{R}$ be a $C^1$ function such that the following two limits exist in $\mathbb{R}$: $f(0) = \lim_{x \to 0} f(x)$ and $f(\infty) = \lim_{x \to \infty} f(x)$.

Then

$$\int_0^\infty \frac{f(ax) - f(bx)}{x}\,dx = (f(\infty) - f(0)) \log \frac{a}{b} \quad \text{for all } a, b > 0.$$


8. (An alternative approach of Weierstrass Approximation Theorem). Consider the Landau sequence,

$$L_n(x) = \begin{cases} c_n (1 - x^2)^n & \text{if } x \in [-1, 1] \\ 0 & \text{if } |x| > 1, \end{cases}$$

where the constants $c_n$ are chosen such that $\int_{-1}^{1} L_n(x)\,dx = 1$. Prove that:
(a) $0 < c_n < (n + 1)/2$;
(b) the Landau sequence is a Dirac sequence;
(c) if $f$ is a continuous function on $\mathbb{R}$, whose support is included in $[-1, 1]$, then $L_n * f$ is a sequence of polynomials converging uniformly to $f$ on $[-1, 1]$;
(d) every continuous function on $[-1, 1]$ is the sum between an affine function $A + Bt$ and a continuous function $g$ which vanishes at the endpoints. Then, using the above assertions, infer the Weierstrass Approximation Theorem.
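The normalizing constants of the Landau sequence and the concentration of its mass near 0 can be explored numerically. This is a sketch under assumptions: the function names and the quadrature resolution are illustrative, and a numerical check of course proves nothing.

```python
def simpson(g, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def landau_constant(n):
    """c_n = 1 / int_{-1}^{1} (1 - x^2)^n dx."""
    return 1.0 / simpson(lambda x: (1.0 - x * x) ** n, -1.0, 1.0)

def landau_mass_outside(n, delta):
    """Mass of L_n on {delta <= |x| <= 1}; the kernel is even, hence the factor 2."""
    c_n = landau_constant(n)
    return 2.0 * c_n * simpson(lambda x: (1.0 - x * x) ** n, delta, 1.0)

for n in (1, 5, 25, 50):
    # c_n stays below (n + 1)/2 while the mass away from 0 shrinks
    print(n, landau_constant(n), (n + 1) / 2, landau_mass_outside(n, 0.5))
```

The shrinking third column is the numerical counterpart of assertion (b): for each fixed $\delta > 0$ the mass of $L_n$ outside $(-\delta, \delta)$ tends to 0.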


11.8 The Lebesgue Spaces $\mathcal{L}^p(A)$ and $L^p(A)$


In practice, it is often useful to consider other spaces of measurable functions than $\mathcal{L}^1(A)$ and $L^1(A)$. In what follows, we will discuss the case of Lebesgue spaces of exponent $p \in (1, \infty)$, over a measurable subset $A$ of $\mathbb{R}$:

$$\mathcal{L}^p(A) = \{ f : A \to \mathbb{C} : f \text{ is measurable and } |f|^p \text{ is integrable} \}.$$

Since

$$|f(x) + g(x)|^p \le 2^{p-1} \left( |f(x)|^p + |g(x)|^p \right) \quad \text{for every } x,$$

one can prove easily (using Lebesgue's Criterion of Integrability 11.5.2) that $\mathcal{L}^p(A)$ is a linear space with respect to the pointwise operations. See Exercise 1. This space is endowed with the seminorm of index $L^p$,

$$\|f\|_{L^p} = \left( \int_A |f(x)|^p\,dx \right)^{1/p}.$$

The fact that $\|\cdot\|_{L^p}$ verifies the triangle inequality (SN3) is shown in Corollary 11.8.2 below.


11.8.1 The Rogers-Hölder Inequality for Lebesgue Spaces Suppose that $p, q \in (1, \infty)$ and $\frac{1}{p} + \frac{1}{q} = 1$. Then the product of every pair of functions $f \in \mathcal{L}^p(A)$ and $g \in \mathcal{L}^q(A)$ belongs to $\mathcal{L}^1(A)$ and

$$\left| \int_A f(x) g(x)\,dx \right| \le \left( \int_A |f(x)|^p\,dx \right)^{1/p} \left( \int_A |g(x)|^q\,dx \right)^{1/q}.$$

The equality occurs if and only if either $f = 0$ a.e. or $g = \lambda f$ a.e. for some complex number $\lambda$.


The particular case where $p = q = 2$ is known as the Cauchy-Buniakovski-Schwarz Inequality.


Proof If $\int_A |f(x)|^p\,dx = 0$ or $\int_A |g(x)|^q\,dx = 0$, then $f$, respectively $g$, is 0 a.e. See Corollary 11.3.12. In this case, the inequality in the statement is trivial. If both integrals are different from zero, we apply Young's Inequality

$$ab \le \frac{a^p}{p} + \frac{b^q}{q},$$

for

$$a = \frac{|f(x)|}{\left( \int_A |f(x)|^p\,dx \right)^{1/p}} \quad \text{and} \quad b = \frac{|g(x)|}{\left( \int_A |g(x)|^q\,dx \right)^{1/q}}.$$

We obtain

$$\frac{|f(x) g(x)|}{\left( \int_A |f(x)|^p\,dx \right)^{1/p} \left( \int_A |g(x)|^q\,dx \right)^{1/q}} \le \frac{|f(x)|^p}{p \int_A |f(x)|^p\,dx} + \frac{|g(x)|^q}{q \int_A |g(x)|^q\,dx},$$

whence, by using Lebesgue's Criterion of Integrability, we infer that $fg \in \mathcal{L}^1(A)$. Then, by integrating both sides over $A$, we obtain the inequality in the statement. The equality case follows easily from the observation that equality occurs in Young's inequality if and only if $a^p = b^q$.
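As an illustration of the Rogers-Hölder Inequality on $A = [a, b]$, the two sides can be compared numerically. This is a sketch only: `simpson` and `holder_sides` are hypothetical helper names, and the test functions are arbitrary choices.

```python
def simpson(g, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def holder_sides(f, g, a, b, p):
    """Return (|int f g|, ||f||_p * ||g||_q) on [a, b], where 1/p + 1/q = 1 and p > 1."""
    q = p / (p - 1.0)
    lhs = abs(simpson(lambda x: f(x) * g(x), a, b))
    norm_f = simpson(lambda x: abs(f(x)) ** p, a, b) ** (1.0 / p)
    norm_g = simpson(lambda x: abs(g(x)) ** q, a, b) ** (1.0 / q)
    return lhs, norm_f * norm_g

# A generic pair: the inequality is strict.
print(holder_sides(lambda x: x, lambda x: 1 + x * x, 0.0, 1.0, 3.0))
# A pair with |g|^q = |f|^p (the equality condition a^p = b^q in Young's inequality).
print(holder_sides(lambda x: x, lambda x: x * x, 0.0, 1.0, 3.0))
```

In the second call $p = 3$, $q = 3/2$ and $|g|^q = x^3 = |f|^p$, so both sides should coincide up to rounding.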


11.8.2 Corollary (Minkowski's Inequality) Suppose $p \ge 1$. For every pair of functions $f, g \in \mathcal{L}^p(A)$,

$$\left( \int_A |f(x) + g(x)|^p\,dx \right)^{1/p} \le \left( \int_A |f(x)|^p\,dx \right)^{1/p} + \left( \int_A |g(x)|^p\,dx \right)^{1/p}.$$

The equality occurs if and only if either $f$ is 0 almost everywhere, or $g = \lambda f$ almost everywhere for a suitable nonnegative number $\lambda$.


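Proof The case $p = 1$ is clear. For $p > 1$, put $q = p/(p - 1)$ and infer from the Rogers-Hölder Inequality that

$$\int_A |f(x) + g(x)|^p\,dx \le \int_A |f(x) + g(x)|^{p-1} |f(x)|\,dx + \int_A |f(x) + g(x)|^{p-1} |g(x)|\,dx$$

$$\le \left( \int_A |f(x) + g(x)|^{(p-1)q}\,dx \right)^{1/q} \left( \int_A |f(x)|^p\,dx \right)^{1/p} + \left( \int_A |f(x) + g(x)|^{(p-1)q}\,dx \right)^{1/q} \left( \int_A |g(x)|^p\,dx \right)^{1/p}$$

$$= \left( \int_A |f(x) + g(x)|^p\,dx \right)^{1/q} \left[ \left( \int_A |f(x)|^p\,dx \right)^{1/p} + \left( \int_A |g(x)|^p\,dx \right)^{1/p} \right],$$

which is equivalent to Minkowski's inequality (recall that $(p - 1)q = p$ and $1 - 1/q = 1/p$). The equality case constitutes the object of Exercise 5.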


We call $\mathcal{L}^p(A)$ the space of Lebesgue $p$-integrable functions on $A$. In the special case where $p = 2$, we speak of square-integrable functions.

The seminorm of index $L^2$ is connected to a special function (called Hermitian product), which is defined on $\mathcal{L}^2(A) \times \mathcal{L}^2(A)$ by the formula

$$\langle f, g \rangle_{L^2} = \int_A f(x)\, \overline{g(x)}\,dx.$$

According to the Cauchy-Buniakovski-Schwarz Inequality,

$$|\langle f, g \rangle_{L^2}| \le \|f\|_{L^2}\, \|g\|_{L^2}.$$

The Hermitian product is continuous: if $\|f_n - f\|_{L^2} \to 0$ and $\|g_n - g\|_{L^2} \to 0$, then $\langle f_n, g_n \rangle_{L^2} \to \langle f, g \rangle_{L^2}$. Indeed, by the aforementioned inequality,

$$|\langle f_n, g_n \rangle_{L^2} - \langle f, g \rangle_{L^2}| = |\langle f_n - f, g_n \rangle_{L^2} + \langle f, g_n - g \rangle_{L^2}| \le \|f_n - f\|_{L^2}\, \|g_n\|_{L^2} + \|g_n - g\|_{L^2}\, \|f\|_{L^2} \to 0$$

as $n \to \infty$.


11.8.3 Theorem Let $(f_n)_n$ be a Cauchy sequence in $\mathcal{L}^p(A)$. Then:

(a) $(f_n)_n$ contains a subsequence that converges almost everywhere and has the property that for every $\varepsilon > 0$, there exists a measurable set $X$ with $\lambda(X) < \varepsilon$, such that this subsequence is uniformly convergent on the complement of $X$.

(b) There exists $f \in \mathcal{L}^p(A)$ such that $\|f_n - f\|_{L^p} \to 0$.


Proof (a) Without loss of generality, we may assume (replacing $(f_n)_n$ by a subsequence if necessary) that

$$\|f_k - f_n\|_{L^p}^p < \frac{1}{2^{2n}}$$

for $k \ge n$. Let $Y_n = \{x : |f_{n+1}(x) - f_n(x)|^p \ge 1/2^n\}$ for $n \in \mathbb{N}$. As in the proof of Lemma 11.3.2, one can show that $\lambda(Y_n) \le 1/2^n$. Put

$$Z_n = Y_n \cup Y_{n+1} \cup Y_{n+2} \cup \cdots .$$

Then, for all $x \notin Z_n$ and $k \ge n$, we have

$$|f_{k+1}(x) - f_k(x)| < \frac{1}{2^{k/p}},$$

whence it follows that the series $f_0(x) + \sum_{n=0}^{\infty} (f_{n+1}(x) - f_n(x))$ is absolutely and uniformly convergent on the complement of $Z_n$ (and thus, absolutely and pointwise convergent on the complement of $Z = \bigcap_{n=0}^{\infty} Z_n$). The proof of assertion (a) ends by noticing that $Z$ is a Lebesgue null set.


(b) We will show that the function $f$, defined by

$$f(x) = f_0(x) + \sum_{n=0}^{\infty} (f_{n+1}(x) - f_n(x)) = \lim_{n \to \infty} f_n(x) \quad \text{for } x \in A \setminus Z,$$

and $f(x) = 0$ on $Z$, belongs to $\mathcal{L}^p(A)$ and $\|f_n - f\|_{L^p} \to 0$. Indeed, for $n$ arbitrarily fixed, we infer from Fatou's Lemma (applied to the sequence $(|f_k - f_n|^p)_k$) that $f - f_n \in \mathcal{L}^p(A)$ and

$$\|f - f_n\|_{L^p}^p = \int_A |f - f_n|^p\,dx \le \liminf_{k \to \infty} \int_A |f_k - f_n|^p\,dx \le \frac{1}{2^{2n}}.$$

The proof ends by noticing that $f = (f - f_n) + f_n \in \mathcal{L}^p(A)$, due to the fact that $\mathcal{L}^p(A)$ is a linear space. □


11.8.4 Corollary Let $(f_n)_n$ be a Cauchy sequence in $\mathcal{L}^p(A)$ which is almost everywhere convergent to a function $f : A \to \mathbb{C}$. Then $f \in \mathcal{L}^p(A)$ and $\|f_n - f\|_{L^p} \to 0$.


A nice feature of the spaces $\mathcal{L}^p(\mathbb{R})$ with $p \in [1, \infty)$ is the density of the subspaces $St(\mathbb{R})$, $C_c(\mathbb{R})$ and $C_c^\infty(\mathbb{R})$.


11.8.5 Theorem If $p \in (1, \infty)$, then every function $f$ in $\mathcal{L}^p(\mathbb{R})$ is approximable by step functions and also by functions in $C_c(\mathbb{R})$ and $C_c^\infty(\mathbb{R})$.


Proof Taking into account that $f = (\operatorname{Re} f)^+ - (\operatorname{Re} f)^- + i[(\operatorname{Im} f)^+ - (\operatorname{Im} f)^-]$, it suffices to consider here only the case of positive functions $f \in \mathcal{L}^p(\mathbb{R})$. By Remark 11.5.5, there exists an increasing sequence $(f_n)_n$ of step functions such that $f_n \to f$ pointwise. Then $\|f_n - f\|_{L^p} \to 0$, according to the Beppo-Levi Theorem. For the density of $C_c(\mathbb{R})$ and $C_c^\infty(\mathbb{R})$, adapt the argument of Theorem 11.5.8.


11.8.6 Definition For $A$ a measurable subset of $\mathbb{R}$, we denote by $\mathcal{L}^\infty(A)$ the set of all essentially bounded measurable functions, that is, of the measurable functions $f : A \to \mathbb{C}$ such that

$$\|f\|_{L^\infty} = \inf_{X \subset A,\ \lambda(X) = 0}\ \sup_{x \in A \setminus X} |f(x)| < \infty;$$

we call the number $\|f\|_{L^\infty}$ the $L^\infty$-seminorm (or the seminorm of index $L^\infty$) of $f$.

$\mathcal{L}^\infty(A)$ is a linear space (in fact a commutative unital algebra). On this space, $\|\cdot\|_{L^\infty}$ is a seminorm.

For all spaces $\mathcal{L}^p(A)$ (with $p \in [1, \infty]$), it is true that

$f \in \mathcal{L}^p(A)$ implies $|f| \in \mathcal{L}^p(A)$.
This makes all subspaces
$\mathcal{L}^p_{\mathbb{R}}(A) = \{ f \in \mathcal{L}^p(A) : f \text{ real-valued} \}$,

linear lattices of functions.


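11.8.7 Proposition Let $A$ be a measurable set of finite measure. Then for all exponents $r$ and $s$ with $1 < r < s < \infty$, we have the inclusions

$$\mathcal{L}^\infty(A) \subset \mathcal{L}^s(A) \subset \mathcal{L}^r(A) \subset \mathcal{L}^1(A),$$

and the inequalities

$$\|f\|_{L^r} \le \|f\|_{L^s}\, \lambda(A)^{1/r - 1/s}, \quad \text{for every } f \in \mathcal{L}^s(A).$$

Proof The inclusions follow from Lebesgue's Criterion of Integrability. The inequalities are consequences of the Rogers-Hölder Inequality (applied with the conjugate exponents $s/r$ and $s/(s - r)$):

$$\int_A |f|^r\,dx \le \left( \int_A |f|^s\,dx \right)^{r/s} \left( \int_A dx \right)^{1 - r/s} = \|f\|_{L^s}^r\, \lambda(A)^{1 - r/s}.$$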
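The inequality $\|f\|_{L^r} \le \|f\|_{L^s}\, \lambda(A)^{1/r - 1/s}$ is easy to probe numerically on an interval $A = [a, b]$. The sketch below uses hypothetical helper names (`simpson`, `lp_norm`, `nested_bound`) and arbitrary sample functions.

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def lp_norm(f, a, b, p):
    """Seminorm of index L^p of f over A = [a, b], for finite p >= 1."""
    return simpson(lambda x: abs(f(x)) ** p, a, b) ** (1.0 / p)

def nested_bound(f, a, b, r, s):
    """Return (||f||_r, ||f||_s * lambda(A)^(1/r - 1/s)) for A = [a, b] and r < s."""
    return lp_norm(f, a, b, r), lp_norm(f, a, b, s) * (b - a) ** (1.0 / r - 1.0 / s)

print(nested_bound(math.exp, 0.0, 2.0, 2.0, 4.0))        # strict inequality
print(nested_bound(lambda x: 3.0, 0.0, 2.0, 2.0, 4.0))   # constants give equality
```

Constant functions turn the Hölder step of the proof into an equality, which is why the second pair of numbers should agree.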


Noticing that

$\|f\|_{L^p} = 0$ if and only if $f = 0$ almost everywhere,


one can associate to each of the seminormed spaces $\mathcal{L}^p(A)$ a Banach space $L^p(A)$, following the model of the space $L^1(\mathbb{R})$. For $p \in (1, \infty)$, $L^p(A)$ is referred to as the Banach space of all classes of Lebesgue $p$-integrable functions on $A$. $L^\infty(A)$ is referred to as the Banach space of all classes of essentially bounded functions on $A$. When dealing with classes of real-valued functions, we will denote the corresponding spaces by $L^p_{\mathbb{R}}(A)$. They are linear lattices of functions with very nice properties. See Exercise 8.

The reader should be aware that working with elements of an $L^p$-space is the same as working with representatives, that is, with elements of the corresponding $\mathcal{L}^p$-space.

In the applications of $L^p$-spaces, it is useful to notice that

$$\mathcal{L}^p([a, b]) = \mathcal{L}^p((a, b)) \quad \text{and} \quad L^p([a, b]) = L^p((a, b)),$$

for all pairs of real numbers $a < b$ and all $p \in [1, \infty]$. Indeed, the modifications on Lebesgue null sets change neither the integrability of a function nor its integral.
More on the $L^p$-spaces can be found in the next section and in Appendix B.


Exercises 
1. Let $u, v \in \mathbb{C}$ and $p \in [1, \infty)$. Prove the inequality

$$|u + v|^p \le 2^{p-1} \left( |u|^p + |v|^p \right).$$

2. (The extension of Rogers-Hölder Inequality). Let $N > 1$ be an integer and let $f_k \in \mathcal{L}^{p_k}(A)$ be functions for $k = 1, \ldots, N$. The exponents $p_k$ are assumed to be in $[1, \infty)$ and $\sum_{k=1}^{N} 1/p_k = 1$. Prove that $f_1 \cdots f_N \in \mathcal{L}^1(A)$ and

$$\left\| \prod_{k=1}^{N} f_k \right\|_{L^1} \le \prod_{k=1}^{N} \|f_k\|_{L^{p_k}}.$$

3. Let $A$ be a measurable set of finite measure. Show by an example that the inclusion

$$\mathcal{L}^\infty(A) \subset \bigcap_{p \ge 1} \mathcal{L}^p(A)$$

is strict.

4. Let $p \in (1, \infty)$. Give an example of a function $f : (0, \infty) \to [0, \infty)$ that belongs to $\mathcal{L}^{p - \varepsilon}((0, \infty))$ for all $\varepsilon > 0$ small enough but does not belong to $\mathcal{L}^p((0, \infty))$.
[Hint: Use the series $\sum 1/n$.]


5. Prove that equality occurs in Minkowski's Inequality if and only if either $f$ is 0 a.e., or $g = \lambda f$ a.e. for a suitable nonnegative number $\lambda$.

6. (a) Prove that all spaces $\mathcal{L}^p(\mathbb{R})$, with $1 \le p < \infty$, are separable.

(b) Prove that if a Banach space is separable, then every subspace of it is also separable. Infer that all spaces $\mathcal{L}^p(A)$, with $1 \le p < \infty$ and $A$ a measurable set, are separable.

(c) Show that $\ell^\infty(\mathbb{N})$ can be identified with a subspace of $\mathcal{L}^\infty(\mathbb{R})$ via the linear isometry

$$T : \ell^\infty(\mathbb{N}) \to L^\infty(\mathbb{R}), \quad T((a_n)_n) = \sum_{n=0}^{\infty} a_n\, \chi_{[n, n+1)}.$$

Infer that $\mathcal{L}^\infty(\mathbb{R})$ is nonseparable.

[Hint: (a) Consider the family of characteristic functions of bounded intervals with rational endpoints and then the family of all linear combinations of such functions with coefficients in $\mathbb{Q} + i\mathbb{Q}$.]


7. (Chebyshev's Inequality for Lebesgue measure). Let $A$ be a measurable set with $0 < \lambda(A) < \infty$ and let $f \in \mathcal{L}^1_{\mathbb{R}}(A) \cap \mathcal{L}^2_{\mathbb{R}}(A)$ be a function with mean value $\mu = \frac{1}{\lambda(A)} \int_A f(x)\,dx$ and standard deviation

$$\sigma = \left( \frac{1}{\lambda(A)} \int_A (f(x) - \mu)^2\,dx \right)^{1/2}.$$

Prove that for every $\varepsilon > 0$, we have the inequality

$$\frac{1}{\lambda(A)}\, \lambda\left( \{ x : |f(x) - \mu| \ge \varepsilon \} \right) \le \frac{\sigma^2}{\varepsilon^2}.$$

8. (a) Prove that all Banach spaces $L^p_{\mathbb{R}}(\mathbb{R})$ (with $1 \le p \le \infty$) are linear lattices of functions with respect to the ordering

$f \le g$ in $L^p_{\mathbb{R}}(\mathbb{R})$ if and only if $f \le g$ almost everywhere
at the level of representatives.

(b) If $1 \le p < \infty$, show that every increasing sequence in $L^p_{\mathbb{R}}(\mathbb{R})$ that is bounded above is convergent (to its least upper bound).

(c) Prove that in every space $L^p_{\mathbb{R}}(\mathbb{R})$ with $1 \le p \le \infty$, the bounded above sequences have a least upper bound.

[Hint: The upper bounds of a sequence $(f_n)_n$ are the same as those of the increasing sequence $(\sup_{0 \le k \le n} f_k)_n$.]

9. Prove that the space $C([-\pi, \pi], \mathbb{R})$ of continuous functions is dense in $L^1_{\mathbb{R}}([-\pi, \pi])$.
10. (Hardy's Inequality). Suppose that $p \in (1, \infty)$ and $f \in L^p((0, \infty))$.
(a) Prove the inequality

$$\|F\|_{L^p} \le \frac{p}{p - 1}\, \|f\|_{L^p},$$

where

$$F(x) = \frac{1}{x} \int_0^x f(t)\,dt, \quad x > 0.$$

(b) Verify the optimality of the constant $p/(p - 1)$, by considering the case of functions of the form $f(t) = t^{-1/p}\, \chi_{[1, n]}(t)$, for $t > 0$ and $n \in \mathbb{N}$.

(c) Infer the discrete form of Hardy's Inequality: For every sequence $(a_n)_n$ of positive numbers and every exponent $p \in (1, \infty)$,

$$\left( \sum_{n=1}^{\infty} \left( \frac{1}{n} \sum_{k=1}^{n} a_k \right)^p \right)^{1/p} \le \frac{p}{p - 1} \left( \sum_{n=1}^{\infty} a_n^p \right)^{1/p}.$$

11.9 The Fourier Transform 


The Fourier transform of an integrable function $f \in \mathcal{L}^1(\mathbb{R})$ is defined by the formula

$$\hat{f}(\omega) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} f(x)\, e^{-i\omega x}\,dx, \quad \omega \in \mathbb{R}.$$

The existence of the right-hand side integral is motivated by Lebesgue's Criterion of Integrability. According to Theorem 11.7.1, $\hat{f}$ is a continuous function. It is also bounded because

$$|\hat{f}(\omega)| \le \frac{1}{\sqrt{2\pi}}\, \|f\|_{L^1} \quad \text{for all } \omega. \qquad (11.12)$$

The Fourier transform of an integrable function is not necessarily integrable. An example is provided by the function $f(x) = e^{-x} \chi_{[0, \infty)}$, whose Fourier transform is

$$\hat{f}(\omega) = \frac{1}{\sqrt{2\pi}\,(1 + i\omega)}, \quad \omega \in \mathbb{R}.$$
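With the normalization $\hat{f}(\omega) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} f(x) e^{-i\omega x}\,dx$, the example $f(x) = e^{-x}\chi_{[0,\infty)}$ can be checked by direct quadrature. This is a rough sketch: `fourier_transform`, the truncation of the integral to $[0, 40]$ and the grid size are illustrative assumptions.

```python
import cmath
import math

def fourier_transform(f, omega, a, b, n=20000):
    """Approximate (1/sqrt(2*pi)) * int_a^b f(x) e^{-i*omega*x} dx by Simpson's rule."""
    if n % 2:
        n += 1
    h = (b - a) / n
    g = lambda x: f(x) * cmath.exp(-1j * omega * x)
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3 / math.sqrt(2 * math.pi)

def exact(omega):
    """Closed form stated in the text for f(x) = e^{-x} on [0, infinity)."""
    return 1 / (math.sqrt(2 * math.pi) * (1 + 1j * omega))

for omega in (0.0, 1.0, 3.5):
    approx = fourier_transform(lambda x: math.exp(-x), omega, 0.0, 40.0)
    print(omega, approx, exact(omega))
```

Since $|\hat{f}(\omega)|$ decays only like $1/|\omega|$, the transform is bounded and continuous but not integrable, which is the point of the example.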


11.9.1 Riemann-Lebesgue Lemma If $f \in L^1(\mathbb{R})$, then

$$\lim_{|\omega| \to \infty} \hat{f}(\omega) = 0.$$

Proof Consider first the case where $f$ is the characteristic function of a bounded interval, for example, $f = \chi_{[a, b]}$. Then, clearly,

$$\hat{f}(\omega) = \frac{1}{\sqrt{2\pi}} \int_a^b e^{-i\omega x}\,dx = \frac{i \left( e^{-i\omega b} - e^{-i\omega a} \right)}{\omega \sqrt{2\pi}}$$

tends to 0 as $|\omega| \to \infty$. Using the linearity of the integral, we extend the conclusion to the case of step functions. In the general case of a function $f \in \mathcal{L}^1(\mathbb{R})$, we approximate it in the seminorm $\|\cdot\|_{L^1}$ with step functions $f_n$. Taking into account (11.12), we have

$$|\hat{f}(\omega)| \le |\hat{f_n}(\omega)| + |\widehat{(f - f_n)}(\omega)| \le |\hat{f_n}(\omega)| + \frac{1}{\sqrt{2\pi}}\, \|f - f_n\|_{L^1},$$

and the conclusion of the lemma is now clear.


Let us denote by $C_0(\mathbb{R})$ the space of all continuous functions $h : \mathbb{R} \to \mathbb{C}$ null at infinity in the sense that

$$\lim_{|x| \to \infty} h(x) = 0.$$

By Corollaries 6.5.3 and 6.5.6, the functions in this space are bounded and uniformly continuous. $C_0(\mathbb{R})$ is a Banach space with respect to pointwise operations of addition and multiplication by scalars and the sup norm.

The above discussion outlines a linear and continuous map,

$$\mathcal{F} : L^1(\mathbb{R}) \to C_0(\mathbb{R}), \quad \mathcal{F}(f) = \hat{f},$$

which will be referred to as the Fourier transform on $L^1(\mathbb{R})$. One can prove that $\mathcal{F}$ is injective, but this fact is not used in this book. The proof can be found in [1], Theorem 9.12.
An interesting feature of the Fourier transform is that it maps the differential polynomials with constant coefficients, $\sum_{n=0}^{N} c_n \frac{d^n}{dx^n}$, into usual polynomials, $\sum_{n=0}^{N} i^n c_n x^n$.
See Exercises 5 and 6 for applications.


11.9.2 Lemma (a) Let $f \in L^1(\mathbb{R})$ be a function such that $x^n f \in L^1(\mathbb{R})$ for the natural number $n$. Then $\hat{f}$ is of class $C^n$ and

$$\frac{d^n \hat{f}}{d\omega^n} = \widehat{(-ix)^n f}.$$

(b) If $f \in C^n(\mathbb{R})$ and $f^{(k)} \in \mathcal{L}^1(\mathbb{R})$ for $k = 1, \ldots, n$, then

$$\widehat{\frac{d^n f}{dx^n}} = (i\omega)^n \hat{f}.$$

Proof (a) Apply Theorem 11.7.2. (b) Use integration by parts and the fact that $\lim_{|x| \to \infty} f^{(k)}(x) = 0$ for $k = 0, \ldots, n - 1$; the existence of this limit follows from the formula $f^{(k)}(x) = f^{(k)}(0) + \int_0^x f^{(k+1)}(t)\,dt$ and the integrability of $f^{(k+1)}$.
We next present a linear subspace on which the Fourier transform induces a linear isomorphism.


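The Schwartz space, or the space of rapidly decreasing functions on $\mathbb{R}$, is the function space $\mathcal{S}(\mathbb{R})$ consisting of all functions $f \in C^\infty(\mathbb{R})$ such that

$$\|f\|_{\mathcal{S}, m, n} = \sup_{x \in \mathbb{R}} (1 + x^2)^m\, |f^{(n)}(x)| < \infty, \quad \text{for all } m, n \in \mathbb{N}.$$

Clearly, $C_c^\infty(\mathbb{R}) \subset \mathcal{S}(\mathbb{R})$. This inclusion is strict because $e^{-x^2}$ belongs to $\mathcal{S}(\mathbb{R})$ and its support is $\mathbb{R}$. On the other hand,

$$\mathcal{S}(\mathbb{R}) \subset L^1(\mathbb{R}) \cap L^2(\mathbb{R}),$$

since every $f \in \mathcal{S}(\mathbb{R})$ is continuous and verifies the inequality $|f(x)| \le \frac{C}{1 + x^2}$ for a suitable constant $C > 0$.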


11.9.3 Lemma A function $f \in C^\infty(\mathbb{R})$ belongs to $\mathcal{S}(\mathbb{R})$ if and only if for every polynomial function $P$ and every natural number $n$, the function $P f^{(n)}$ is bounded.


Proof The sufficiency part is immediate. As concerns the necessity, if $P$ is a polynomial function of degree $m$, then there exists a constant $C > 0$ such that

$$|P(x)| \le C (1 + x^2)^m, \quad \text{for all } x \in \mathbb{R}.$$

Therefore,

$$|P(x) f^{(n)}(x)| \le C (1 + x^2)^m\, |f^{(n)}(x)| \le C\, \|f\|_{\mathcal{S}, m, n}, \quad \text{for all } x \in \mathbb{R}.$$

11.9.4 Corollary If $f \in \mathcal{S}(\mathbb{R})$, then $P f^{(n)} \in \mathcal{S}(\mathbb{R})$ for every polynomial function $P$ and every natural number $n$.


$\mathcal{S}(\mathbb{R})$ is a linear space. It is also a commutative algebra with respect to pointwise multiplication (without unit, since the function identically 1 does not belong to $\mathcal{S}(\mathbb{R})$). The fact that the product of every two functions $f, g \in \mathcal{S}(\mathbb{R})$ belongs to $\mathcal{S}(\mathbb{R})$ follows from Lemma 11.9.3 and Leibniz's Formula,

$$(fg)^{(n)} = \sum_{k=0}^{n} \binom{n}{k} f^{(k)} g^{(n-k)}.$$

Thus, if $P$ is a polynomial function and $n$ is a natural number, then $P \cdot (fg)^{(n)}$ is a finite sum of products of polynomials multiplied by derivatives of $f$ and $g$, that is, a finite sum of bounded functions. By Lemma 11.9.3, we conclude that $fg \in \mathcal{S}(\mathbb{R})$.


11.9.5 Lemma The Fourier transform of every function $f \in \mathcal{S}(\mathbb{R})$ is also in $\mathcal{S}(\mathbb{R})$.


Proof It suffices to show that for every pair of natural numbers $m$ and $n$, the function $(i\omega)^m \frac{d^n \hat{f}}{d\omega^n}$ is bounded. By Lemma 11.9.2,

$$(i\omega)^m \frac{d^n \hat{f}}{d\omega^n} = \widehat{\frac{d^m}{dx^m} \left( (-ix)^n f \right)},$$

and we know that the Fourier transform of every integrable function is bounded.


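In connection with Lemma 11.9.5, it is important to notice the case of the function $\varphi(x) = e^{-x^2/2}$, whose Fourier transform is itself, that is, $\hat{\varphi}(\omega) = e^{-\omega^2/2}$. This fact is equivalent to the assertion that the function

$$f(\omega) = e^{\omega^2/2}\, \hat{\varphi}(\omega)$$

is identically 1. We have

$$f(0) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} e^{-x^2/2}\,dx = 1.$$

Since

$$\frac{d\hat{\varphi}}{d\omega} = \frac{d}{d\omega} \left( \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} e^{-x^2/2} e^{-i\omega x}\,dx \right) = \frac{-i}{\sqrt{2\pi}} \int_{\mathbb{R}} x\, e^{-x^2/2} e^{-i\omega x}\,dx$$

$$= \frac{i}{\sqrt{2\pi}} \left( \left. e^{-x^2/2} e^{-i\omega x} \right|_{-\infty}^{\infty} + i\omega \int_{\mathbb{R}} e^{-x^2/2} e^{-i\omega x}\,dx \right) = -\omega\, \hat{\varphi},$$

it follows that

$$\frac{df}{d\omega} = \omega\, e^{\omega^2/2}\, \hat{\varphi} + e^{\omega^2/2}\, \frac{d\hat{\varphi}}{d\omega} = \omega\, e^{\omega^2/2}\, \hat{\varphi} - \omega\, e^{\omega^2/2}\, \hat{\varphi} = 0.$$

Therefore $f = 1$ on $\mathbb{R}$ and thus $\hat{\varphi}(\omega) = e^{-\omega^2/2}$.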
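The fact that $e^{-x^2/2}$ is its own Fourier transform (for the normalization with the factor $1/\sqrt{2\pi}$ and the kernel $e^{-i\omega x}$) can be sanity-checked by quadrature. A sketch, assuming a hypothetical helper `fourier_transform` and a truncation of the real line to $[-12, 12]$:

```python
import cmath
import math

def fourier_transform(f, omega, a, b, n=4000):
    """Approximate (1/sqrt(2*pi)) * int_a^b f(x) e^{-i*omega*x} dx by Simpson's rule."""
    if n % 2:
        n += 1
    h = (b - a) / n
    g = lambda x: f(x) * cmath.exp(-1j * omega * x)
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3 / math.sqrt(2 * math.pi)

def gaussian(x):
    return math.exp(-x * x / 2)

for omega in (0.0, 0.7, 2.0):
    phi_hat = fourier_transform(gaussian, omega, -12.0, 12.0)
    # the product e^{omega^2/2} * phi_hat(omega) should be (numerically) equal to 1
    print(omega, phi_hat, math.exp(omega * omega / 2) * phi_hat)
```

The last printed column is the function $f(\omega) = e^{\omega^2/2}\hat{\varphi}(\omega)$ from the argument above, which the proof shows to be identically 1.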
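The map

$$\mathcal{F} : \mathcal{S}(\mathbb{R}) \to \mathcal{S}(\mathbb{R}), \quad \mathcal{F}(f) = \hat{f},$$

induced by the Fourier transform is linear. It is also bijective (and thus, an algebraic isomorphism of linear spaces), due to the following result:


11.9.6 The Fourier Inversion Theorem If $f \in \mathcal{S}(\mathbb{R})$, then

$$f(x) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} \hat{f}(\omega)\, e^{i\omega x}\,d\omega.$$

As a consequence,

$$\hat{\hat{f}}(x) = f(-x).$$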


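Proof For every pair of functions $f, g \in \mathcal{S}(\mathbb{R})$, we have

$$\int_{\mathbb{R}} g(\omega)\, \hat{f}(\omega)\, e^{i\omega x}\,d\omega = \int_{\mathbb{R}} g(\omega)\, e^{i\omega x} \left( \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} f(y)\, e^{-i\omega y}\,dy \right) d\omega$$

$$= \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} g(\omega) \left( \int_{\mathbb{R}} f(y)\, e^{-i\omega (y - x)}\,dy \right) d\omega$$

$$= \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} g(\omega) \left( \int_{\mathbb{R}} f(t + x)\, e^{-i\omega t}\,dt \right) d\omega$$

$$= \int_{\mathbb{R}} f(t + x) \left( \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} g(\omega)\, e^{-i\omega t}\,d\omega \right) dt$$

$$= \int_{\mathbb{R}} f(t + x)\, \hat{g}(t)\,dt.$$

We will apply the above formula for $g(x) = \varphi(\varepsilon x) = e^{-(\varepsilon x)^2/2}$, where $\varepsilon > 0$ is arbitrarily fixed. In this case,

$$\hat{g}(\omega) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} \varphi(\varepsilon x)\, e^{-i\omega x}\,dx = \frac{1}{\varepsilon \sqrt{2\pi}} \int_{\mathbb{R}} \varphi(u)\, e^{-i(\omega/\varepsilon) u}\,du = \frac{1}{\varepsilon}\, \hat{\varphi}\left( \frac{\omega}{\varepsilon} \right) = \frac{1}{\varepsilon}\, \varphi\left( \frac{\omega}{\varepsilon} \right),$$

so we obtain the formula

$$\int_{\mathbb{R}} \varphi(\varepsilon \omega)\, \hat{f}(\omega)\, e^{i\omega x}\,d\omega = \frac{1}{\varepsilon} \int_{\mathbb{R}} f(t + x)\, \varphi\left( \frac{t}{\varepsilon} \right) dt = \int_{\mathbb{R}} f(\varepsilon u + x)\, \varphi(u)\,du.$$

Passing to the limit in both sides as $\varepsilon \to 0$, we infer that

$$\int_{\mathbb{R}} \hat{f}(\omega)\, e^{i\omega x}\,d\omega = \lim_{\varepsilon \to 0} \int_{\mathbb{R}} f(\varepsilon u + x)\, \varphi(u)\,du = \int_{\mathbb{R}} f(x)\, \varphi(u)\,du = f(x) \int_{\mathbb{R}} e^{-u^2/2}\,du = \sqrt{2\pi}\, f(x),$$

that is,

$$f(x) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} \hat{f}(\omega)\, e^{i\omega x}\,d\omega.$$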


The Fourier Inversion Theorem allows us to prove easily that the Fourier transform establishes a bijection from $\mathcal{S}(\mathbb{R})$ onto itself. We start with injectivity. If $\hat{f} = \hat{g}$, then $\hat{\hat{f}} = \hat{\hat{g}}$ and thus $f(-x) = g(-x)$ for all $x \in \mathbb{R}$. As concerns the surjectivity, let $f \in \mathcal{S}(\mathbb{R})$. Then $g(x) = \hat{f}(-x)$ belongs to $\mathcal{S}(\mathbb{R})$ and the Fourier Inversion Theorem shows that $\hat{g} = f$. Therefore $f = \mathcal{F}(g)$.

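The proof of the Fourier Inversion Theorem outlines the formula

$$\int_{\mathbb{R}} g(\omega)\, \hat{f}(\omega)\, e^{ix\omega}\,d\omega = \int_{\mathbb{R}} f(t + x)\, \hat{g}(t)\,dt, \qquad (11.13)$$

valid for all $f, g \in \mathcal{S}(\mathbb{R})$ and $x \in \mathbb{R}$. Since $C_c^\infty(\mathbb{R})$ is dense in $L^1(\mathbb{R})$, $\mathcal{S}(\mathbb{R})$ is also dense in $L^1(\mathbb{R})$. This allows us to extend the formula (11.13) to all $f \in L^1(\mathbb{R})$ and $g \in \mathcal{S}(\mathbb{R})$. Then for $f \in L^1(\mathbb{R})$ a function such that $\hat{f} \in L^1(\mathbb{R})$ and $g(\omega) = e^{-\varepsilon^2 \omega^2/2}$ (where $\varepsilon > 0$ is a parameter), we obtain

$$\int_{\mathbb{R}} e^{-\varepsilon^2 \omega^2/2}\, \hat{f}(\omega)\, e^{ix\omega}\,d\omega = \int_{\mathbb{R}} f(\varepsilon t + x)\, e^{-t^2/2}\,dt$$

for almost every $x$. Passing to the limit as $\varepsilon \to 0$, we conclude that

$$\int_{\mathbb{R}} \hat{f}(\omega)\, e^{ix\omega}\,d\omega = \sqrt{2\pi}\, f(x) \quad \text{for almost every } x,$$

which yields the following conclusion:


11.9.7 Proposition If $f$ and $\hat{f}$ belong to $L^1(\mathbb{R})$, then

$$\hat{\hat{f}}(x) = f(-x) \quad \text{almost everywhere.}$$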


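For $x = 0$, the formula (11.13) yields the equality

$$\int_{\mathbb{R}} \hat{f}(\omega)\, g(\omega)\,d\omega = \int_{\mathbb{R}} f(t)\, \hat{g}(t)\,dt \quad \text{for all } f, g \in \mathcal{S}(\mathbb{R}),$$

which in turn yields the following important result:


11.9.8 Lemma (Parseval's Identity) For all $f, g \in \mathcal{S}(\mathbb{R})$,

$$\langle f, g \rangle_{L^2} = \langle \hat{f}, \hat{g} \rangle_{L^2}.$$

In particular,

$$\|f\|_{L^2} = \|\hat{f}\|_{L^2}.$$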


Parseval's Identity makes it possible to extend the Fourier transform, by a continuity argument, to a bijective linear isometry of $L^2(\mathbb{R})$ onto itself. On $L^1(\mathbb{R}) \cap L^2(\mathbb{R})$, this extension agrees with the original Fourier transform defined on $L^1(\mathbb{R})$, thus enlarging the domain of the Fourier transform to $L^1(\mathbb{R}) + L^2(\mathbb{R})$. This fact, known as Plancherel's Theorem, will be proved in what follows.

Since $C_c^\infty(\mathbb{R})$ is dense in $L^2(\mathbb{R})$, $\mathcal{S}(\mathbb{R})$ is dense in $L^2(\mathbb{R})$ too. Therefore, every function $f \in L^2(\mathbb{R})$ admits approximating sequences $(f_n)_n$ consisting of functions in $\mathcal{S}(\mathbb{R})$. A convergent sequence is also a Cauchy sequence and thus, $\|f_j - f_k\|_{L^2} \to 0$ as $j, k \to \infty$. By Parseval's Identity,

$$\|\hat{f_j} - \hat{f_k}\|_{L^2} \to 0 \quad \text{as } j, k \to \infty.$$

Since $L^2(\mathbb{R})$ is complete, the sequence $(\hat{f_n})_n$ must be convergent in $L^2(\mathbb{R})$.
The Fourier transform of $f$ is defined as

$$\mathcal{F} f = \lim_{n \to \infty} \hat{f_n}.$$

The definition of $\mathcal{F} f$ does not depend on the particular approximating sequence $(f_n)_n$. Indeed, by the above argument, the Fourier transform maps any such approximating sequence into a convergent sequence. Then consider two approximating sequences $(f_n)_n$ and $(\tilde{f}_n)_n$. By interlacing their terms, we obtain a new approximating sequence

$$f_0,\ \tilde{f}_0,\ f_1,\ \tilde{f}_1,\ f_2,\ \tilde{f}_2,\ \ldots$$

The sequence of their Fourier transforms must be convergent in $L^2(\mathbb{R})$, whence

$$\lim_{n \to \infty} \hat{f_n} = \lim_{n \to \infty} \hat{\tilde{f}}_n.$$

The map $\mathcal{F} : L^2(\mathbb{R}) \to L^2(\mathbb{R})$ so defined is linear. It is also bijective. Indeed, by the Fourier Inversion Theorem,

$$\hat{\hat{f}}(x) = f(-x) \quad \text{for every } f \in \mathcal{S}(\mathbb{R}),$$

and this fact can be extended easily to every function in $L^2(\mathbb{R})$.


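11.9.9 Remark (The extension of Parseval's Identity) For every $f, g \in L^2(\mathbb{R})$, using appropriate approximating sequences and Parseval's Identity, we infer that

$$\langle \mathcal{F} f, \mathcal{F} g \rangle_{L^2} = \langle \lim_{n \to \infty} \hat{f_n}, \lim_{n \to \infty} \hat{g_n} \rangle_{L^2} = \lim_{n \to \infty} \langle \hat{f_n}, \hat{g_n} \rangle_{L^2} = \lim_{n \to \infty} \langle f_n, g_n \rangle_{L^2} = \langle f, g \rangle_{L^2}.$$

In particular, $\|\mathcal{F} f\|_{L^2} = \|f\|_{L^2}$.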


11.9.10 Remark On $L^1(\mathbb{R}) \cap L^2(\mathbb{R})$, the Fourier transform defined on $L^2(\mathbb{R})$ agrees with the original Fourier transform defined on $L^1(\mathbb{R})$. Indeed, for every $R > 0$, the inclusion operator $L^2([-R, R]) \to L^1([-R, R])$ is continuous and thus, the $L^2$-Fourier transform of every function $h \in L^2([-R, R])$ (extended with 0 outside $[-R, R]$) is the same as its $L^1$-Fourier transform. Therefore, if $f \in L^2(\mathbb{R})$, all its truncations $f \cdot \chi_{[-R, R]}$ belong to $L^1(\mathbb{R}) \cap L^2(\mathbb{R})$, and

$$\|\mathcal{F}(f \cdot \chi_{[-R, R]}) - \mathcal{F} f\|_{L^2} \to 0 \quad \text{as } R \to \infty.$$

Taking into account the above remark, we obtain for the $L^2$-Fourier transform of $f$ the formula

$$(\mathcal{F} f)(\omega) = \underset{R \to \infty}{\text{l.i.m.}}\ \frac{1}{\sqrt{2\pi}} \int_{-R}^{R} f(x)\, e^{-i\omega x}\,dx,$$

where l.i.m. (limes in medio) means that the limit is taken in the sense of the metric topology on $L^2(\mathbb{R})$. If $f \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$, the two Fourier transforms coincide. Indeed, by Theorem 11.8.3, there exists a sequence $R_n \to \infty$ such that $\frac{1}{\sqrt{2\pi}} \int_{-R_n}^{R_n} f(x)\, e^{-i\omega x}\,dx \to (\mathcal{F} f)(\omega)$ almost everywhere, while the presence of $f$ in $L^1(\mathbb{R})$ assures that

$$\frac{1}{\sqrt{2\pi}} \int_{-R}^{R} f(x)\, e^{-i\omega x}\,dx \to \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(x)\, e^{-i\omega x}\,dx,$$

uniformly as $R \to \infty$.
We end this section with a famous result that has deep consequences in Quantum 
Mechanics. 


11.9.11 Theorem (The Uncertainty Principle) Let $f \in L^2(\mathbb{R}) \cap C^1(\mathbb{R})$ be a function which verifies the following four conditions: $\|f\|_{L^2} = 1$, $x f(x) \in L^2(\mathbb{R})$, $f' \in L^2(\mathbb{R})$ and $\lim_{|x| \to \infty} \sqrt{|x|}\, f(x) = 0$. Then

$$\left( \int_{\mathbb{R}} x^2\, |f(x)|^2\,dx \right) \left( \int_{\mathbb{R}} \omega^2\, |\hat{f}(\omega)|^2\,d\omega \right) \ge \frac{1}{4},$$

provided that the two integrals exist.


By Parseval's Identity, $\|\hat{f}\|_{L^2} = 1$, so we may regard $|f(x)|^2$ and $|\hat{f}(\omega)|^2$ as defining probability density functions. Taking into account the formulas (a) and (b) in Exercise 1, we may assume that the mean of each of these densities is 0, that is,

$$\int_{\mathbb{R}} x\, |f(x)|^2\,dx = \int_{\mathbb{R}} \omega\, |\hat{f}(\omega)|^2\,d\omega = 0.$$

Thus, the probabilistic meaning of the two integrals in the statement is that of dispersion (of $|f(x)|^2$ and $|\hat{f}(\omega)|^2$) and the inequality asserts that the more concentrated $f$ is in the neighborhood of zero, the more spread out $\hat{f}$ must be, and conversely. In Signal Theory, the Uncertainty Principle shows that one cannot jointly localize a signal in time and frequency arbitrarily well; either one has poor frequency localization or poor time localization.

Theorem 11.9.11 is used in Quantum Mechanics to prove Heisenberg's Uncertainty Principle which says that we cannot measure simultaneously the position and the momentum of a particle with high precision. The more accurately we know one of these values, the less accurately we know the other.


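Proof We will restrict here to the case where $f$ is real. According to the Cauchy-Buniakovski-Schwarz Inequality,

$$\left| \int_{\mathbb{R}} x f(x) f'(x)\,dx \right|^2 \le \int_{\mathbb{R}} x^2 f^2(x)\,dx \cdot \int_{\mathbb{R}} f'^2(x)\,dx.$$

An application of integration by parts yields

$$\int_{-\infty}^{\infty} x f(x) f'(x)\,dx = \left. \frac{x f^2(x)}{2} \right|_{-\infty}^{\infty} - \frac{1}{2} \int_{-\infty}^{\infty} f^2(x)\,dx = -\frac{1}{2}.$$

By Parseval's Theorem,

$$\int_{\mathbb{R}} f'^2(x)\,dx = \int_{\mathbb{R}} \omega^2\, |\hat{f}(\omega)|^2\,d\omega$$

and the conclusion of the theorem is now clear.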
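The inequality of Theorem 11.9.11 can be illustrated numerically for real functions, using the identity $\int f'^2\,dx = \int \omega^2 |\hat{f}(\omega)|^2\,d\omega$ from the proof so that no Fourier transform has to be computed. The sketch below is illustrative only: `dispersion_product` is a hypothetical name, the functions are normalized inside it, and the real line is truncated to $[-L, L]$.

```python
import math

def simpson(g, a, b, n=4000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def dispersion_product(f, df, L=12.0):
    """(int x^2 f^2)(int f'^2) / (int f^2)^2 for a real f with derivative df.

    Dividing by (int f^2)^2 amounts to normalizing f to have L^2 norm 1.
    """
    norm2 = simpson(lambda x: f(x) ** 2, -L, L)
    position = simpson(lambda x: x * x * f(x) ** 2, -L, L)
    frequency = simpson(lambda x: df(x) ** 2, -L, L)
    return position * frequency / norm2 ** 2

gauss = dispersion_product(lambda x: math.exp(-x * x / 2),
                           lambda x: -x * math.exp(-x * x / 2))
quartic = dispersion_product(lambda x: math.exp(-x ** 4),
                             lambda x: -4 * x ** 3 * math.exp(-x ** 4))
print(gauss, quartic)  # the Gaussian attains 1/4, the quartic exponential exceeds it
```

This matches Exercise 6 of this section, according to which equality is attained only by Gaussian functions.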


Applications of the Fourier transform to differential equations are illustrated in the 
next section. 


Exercises 


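1. Suppose that $f \in \mathcal{L}^1(\mathbb{R})$ and $\alpha$ and $\lambda$ are real numbers. Prove the following assertions:
(a) the Fourier transform of the function $g(x) = f(x)\, e^{i\alpha x}$ is $\hat{g}(\omega) = \hat{f}(\omega - \alpha)$;
(b) the Fourier transform of the function $g(x) = f(x - \alpha)$ is $\hat{g}(\omega) = \hat{f}(\omega)\, e^{-i\alpha \omega}$;
(c) the Fourier transform of the function $g(x) = f(-x)$ is $\hat{g}(\omega) = \hat{f}(-\omega)$;
(d) the Fourier transform of the function $g(x) = f(x/\lambda)$ for $\lambda > 0$ is $\hat{g}(\omega) = \lambda \hat{f}(\lambda \omega)$.
2. Prove that the Fourier transform of the function $e^{-|x|}$ is the function $\sqrt{\frac{2}{\pi}}\, \frac{1}{1 + \omega^2}$.
3. Prove that the Fourier transform of the function

$$f(x) = \begin{cases} 1 & \text{if } |x| \le a \\ 0 & \text{if } |x| > a \end{cases}$$

is the function

$$\hat{f}(\omega) = \begin{cases} \sqrt{\frac{2}{\pi}}\, \frac{\sin a\omega}{\omega} & \text{if } \omega \in \mathbb{R} \setminus \{0\} \\ a \sqrt{\frac{2}{\pi}} & \text{if } \omega = 0. \end{cases}$$

Notice that both functions belong to $L^2(\mathbb{R})$ and prove that the Fourier transform of $\hat{f}$ is $f$.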
4. According to Parseval’s Identity, 


$\|\mathcal{F} f\|_{L^2} = \|f\|_{L^2}$ for every $f \in L^2(\mathbb{R})$.


Use this fact to prove that 


and 


whenever a > 0. 
[Hint: Consider the functions $\chi_{[-a, a]}$ and $(1 - |x|/a)\, \chi_{[-a, a]}$.]
5. Use the Fourier transform to show that the equation

$$f(x + 2) + f(x - 2) = 2 f''(x) + e^{-x^2}, \quad x \in \mathbb{R},$$

has a solution in the space $\mathcal{S}(\mathbb{R})$.
6. Prove, using the case of equality in the Cauchy-Schwarz Inequality, that equality holds in the inequality appearing in Theorem 11.9.11 only for functions of the form $f(x) = C e^{-\alpha x^2}$, where $C \in \mathbb{R}$ and $\alpha > 0$.


11.10 Applications of Fourier Transforms 


The applications of Fourier transform to linear differential equations and probability 
theory are based on its main features: linearity, the property to convert differential 
equations into algebraic equations and the fact that it transforms the convolution 
product to usual multiplication (and vice versa). 


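11.10.1 Lemma If $f, g \in \mathcal{S}(\mathbb{R})$, then:

(a) $f * g \in \mathcal{S}(\mathbb{R})$;
(b) $\widehat{f * g} = \sqrt{2\pi}\, \hat{f} \cdot \hat{g}$;
(c) $\widehat{fg} = \frac{1}{\sqrt{2\pi}}\, \hat{f} * \hat{g}$.
Proof (a) We start by noticing that

$$1 + x^2 = 1 + (x - y + y)^2 \le 1 + 2(x - y)^2 + 2y^2 \le 2 \left( 1 + (x - y)^2 \right) \left( 1 + y^2 \right)$$

for all $x, y \in \mathbb{R}$. By combining this fact with Corollary 11.9.4 and the Cauchy-Buniakovski-Schwarz Inequality, we infer that

$$(1 + x^2)^m \left| \frac{d^n (f * g)}{dx^n}(x) \right| \le \int_{\mathbb{R}} (1 + x^2)^m \left| \frac{d^n f}{dx^n}(x - y) \right| |g(y)|\,dy$$

$$\le 2^m \int_{\mathbb{R}} \left| \left( 1 + (x - y)^2 \right)^m \frac{d^n f}{dx^n}(x - y) \right| \left| (1 + y^2)^m g(y) \right| dy$$

$$\le 2^m \left\| (1 + x^2)^m \frac{d^n f}{dx^n} \right\|_{L^2} \left\| (1 + y^2)^m g \right\|_{L^2},$$

which yields the membership of $f * g$ to $\mathcal{S}(\mathbb{R})$.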
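(b) By the Fubini-Tonelli Theorem and the property of invariance under translation of the Lebesgue integral, we get

$$\widehat{f * g}(\omega) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} (f * g)(x)\, e^{-i\omega x}\,dx$$

$$= \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} \left( \int_{\mathbb{R}} f(x - y)\, g(y)\,dy \right) e^{-i\omega x}\,dx$$

$$= \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} \left( \int_{\mathbb{R}} f(x - y)\, e^{-i\omega x}\,dx \right) g(y)\,dy$$

$$= \int_{\mathbb{R}} \left( \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} f(t)\, e^{-i\omega t}\,dt \right) g(y)\, e^{-i\omega y}\,dy$$

$$= \hat{f}(\omega) \int_{\mathbb{R}} g(y)\, e^{-i\omega y}\,dy$$

$$= \sqrt{2\pi}\, \hat{f}(\omega)\, \hat{g}(\omega).$$

(c) Use (b) and the inversion formula $\hat{\hat{f}}(x) = f(-x)$.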


Using an easy approximation argument, we can extend the assertions (b) and (c) of Lemma 11.10.1 to the case where $f \in L^1(\mathbb{R})$ (or $f \in L^2(\mathbb{R})$) and $g \in \mathcal{S}(\mathbb{R})$.
A simple example illustrating the usefulness of the Fourier transform in solving differential equations is provided by the equation

$$-u'' + a^2 u = f(x), \qquad (11.14)$$

where $a > 0$ and $f \in C(\mathbb{R})$ are given data. A solution (in the classical sense) is any function $u \in C^2(\mathbb{R})$ which verifies the equation at all points of $\mathbb{R}$. For the existence of such a solution we must assume that $f$ is continuous. If $v$ is a particular solution, then $w = u - v$ verifies the equation

$$-w'' + a^2 w = 0$$

and thus, $w$ is a linear combination of the functions $e^{-ax}$ and $e^{ax}$. See Sect. 8.2, Exercise 7. This reasoning reduces the problem of solving the differential equation (11.14) to the problem of finding a particular solution. Here comes the role of the Fourier transform. We assume $f \in C(\mathbb{R}) \cap L^2(\mathbb{R})$ and seek a solution $v \in C^2(\mathbb{R}) \cap L^2(\mathbb{R})$ such that $v' \in L^2(\mathbb{R})$. By applying the Fourier transform to the equation (11.14), we obtain

$$(\omega^2 + a^2)\, \hat{v}(\omega) = \hat{f}(\omega),$$

which yields

$$\hat{v}(\omega) = \frac{\hat{f}(\omega)}{\omega^2 + a^2}.$$

Denoting by $G$ the inverse Fourier transform of

$$\hat{G}(\omega) = \frac{1}{\omega^2 + a^2},$$

we infer from Lemma 11.10.1(c) that

$$v = \frac{1}{\sqrt{2\pi}}\, f * G.$$

The expression of $G$ can be deduced from Exercise 2, Sect. 11.9:

$$G(x) = \frac{\sqrt{2\pi}}{2a}\, e^{-a|x|}.$$

Therefore,

$$v(x) = \frac{1}{2a} \int_{\mathbb{R}} f(x - y)\, e^{-a|y|}\,dy.$$

The reader should be aware that under our weak assumptions on $f$, this formula provides only a generalized solution since the equality $-v'' + a^2 v = f(x)$ holds only almost everywhere. See Sect. B5. A classical solution is obtained for $f \in \mathcal{S}(\mathbb{R})$.
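For a smooth, rapidly decaying right-hand side, the convolution formula $v(x) = \frac{1}{2a} \int_{\mathbb{R}} f(x - y)\, e^{-a|y|}\,dy$ can be tested against the equation $-v'' + a^2 v = f$ by finite differences. The following is only a sketch: the names `green_solution` and `residual`, the choice $f(x) = e^{-x^2}$, the truncation to $|y| \le L$ and the step sizes are assumptions made for illustration.

```python
import math

def simpson(g, a, b, n=2400):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def green_solution(f, a, x, L=12.0):
    """v(x) = (1/(2a)) * int f(x - y) e^{-a|y|} dy, truncated to |y| <= L.

    The integral is split at y = 0, where the kernel e^{-a|y|} has a kink.
    """
    k = lambda y: f(x - y) * math.exp(-a * abs(y))
    return (simpson(k, -L, 0.0) + simpson(k, 0.0, L)) / (2 * a)

def residual(f, a, x, h=1e-2):
    """-v''(x) + a^2 v(x) - f(x), with v'' replaced by a central second difference."""
    v0 = green_solution(f, a, x)
    v_plus = green_solution(f, a, x + h)
    v_minus = green_solution(f, a, x - h)
    second = (v_plus - 2 * v0 + v_minus) / h ** 2
    return -second + a * a * v0 - f(x)

f = lambda x: math.exp(-x * x)
for x in (-1.0, 0.0, 0.7):
    print(x, residual(f, 1.5, x))  # residuals are small, of the order of the step h^2
```

The residual does not vanish exactly because the second derivative is only approximated; it shrinks as the step `h` is reduced (until rounding errors take over).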

The Fourier transform provides a very useful tool for solving problems involving partial differential equations, that is, equations involving partial derivatives. We already met partial derivatives when differentiating under the integral sign. See Sect. 9.4.

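The Cauchy problem for the diffusion equation on the real line seeks a real-valued function $u$, which is continuous on $\mathbb{R} \times [0, \infty)$, of class $C^2$ on $\mathbb{R} \times (0, \infty)$ and verifies the equation

$$\frac{\partial u}{\partial t} = a^2\, \frac{\partial^2 u}{\partial x^2} \quad \text{for all } x \in \mathbb{R} \text{ and } t > 0 \qquad \text{(DEq)}$$

together with the initial condition

$$u(x, 0) = f(x) \quad \text{for all } x \in \mathbb{R}. \qquad \text{(IC)}$$

Here, $a > 0$ is a constant.

A rigorous motivation of the diffusion equation can be found in the book of Vladimirov [3], pp. 30-32.

We will assume first that $f \in \mathcal{S}(\mathbb{R})$ and we will search for solutions $u$ such that all partial functions $x \mapsto u(x, t)$ belong to $\mathcal{S}(\mathbb{R})$, whenever $t > 0$.

Let $\hat{u}$ be the Fourier transform of $u$ with respect to the variable $x$ ($t$ is considered a parameter). Then, $\hat{u}$ appears as the solution of the problem

$$\begin{cases} \dfrac{\partial \hat{u}}{\partial t}(\omega, t) = -a^2 \omega^2\, \hat{u}(\omega, t) \\ \hat{u}(\omega, 0) = \hat{f}(\omega) \end{cases}$$

and thus, it is of the form $\hat{u} = C(\omega)\, e^{-a^2 \omega^2 t}$ for a suitable $C(\omega)$ that does not depend on $t$. The condition at $t = 0$ yields $C(\omega) = \hat{f}(\omega)$. Therefore

$$\hat{u}(\omega, t) = \hat{f}(\omega)\, e^{-a^2 \omega^2 t}.$$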


For t > 0 arbitrarily fixed, the function wm > ene art belongs to S (R) and 
therefore there exists a function EF (x, t) € S (R), depending on the variable x and 
the parameter ¢, such that E (w,t) = ef ®t This allows us to represent @ as 
product of Fourier transforms, 


=) 

ll 
a 

ie) 
According to Lemma 11.10.1, $u$ verifies the formula

\[ u(x, t) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} E(x - y, t)\, f(y)\, dy, \quad \text{for all } x \in \mathbb{R},\ t > 0. \]

This formula is accompanied by the initial condition,

\[ u(x, 0) = f(x), \quad x \in \mathbb{R}. \]

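The computation of $E(x, t)$ follows from the Fourier Inversion Theorem:

\[ \begin{aligned}
E(x, t) &= \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} \hat{E}(\omega, t)\, e^{ix\omega}\, d\omega
         = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} e^{-a^2 \omega^2 t}\, e^{ix\omega}\, d\omega \\
        &= \frac{1}{\sqrt{2\pi}}\, e^{-x^2/(4a^2 t)} \int_{\mathbb{R}} e^{-\left(a\omega\sqrt{t} - ix/(2a\sqrt{t})\right)^2}\, d\omega
         = \frac{1}{a\sqrt{2t}}\, e^{-x^2/(4a^2 t)}.
\end{aligned} \]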


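The above reasoning shows that when the initial data $f$ belongs to $\mathcal{S}(\mathbb{R})$ and
the problem (DEq) and (IC) has a solution $u$, this solution must verify Poisson's
Formula:

\[ u(x, t) = \begin{cases} f(x) & \text{if } t = 0 \\[1ex] \dfrac{1}{2a\sqrt{\pi t}} \displaystyle\int_{\mathbb{R}} f(y)\, e^{-(x-y)^2/(4a^2 t)}\, dy & \text{if } t > 0. \end{cases} \tag{11.15} \]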


Now, using Theorems 11.7.1 and 11.7.2 (on dependence of integrals on parameters),
one can easily prove the following result:


11.10.2 Theorem Poisson's Formula produces a solution $u$ (continuous on $\mathbb{R} \times [0, \infty)$
and of class $C^2$ on $\mathbb{R} \times (0, \infty)$) for the Cauchy problem (DEq) and (IC)
whenever the initial data $f$ is continuous and bounded on $\mathbb{R}$.


Proof The fact that $u$ verifies the diffusion equation is a straightforward computation
based on the possibility of differentiating (repeatedly) under the integral sign. We
are thus led to consider integrals of the form

\[ t^{-k} \int_{\mathbb{R}} f(y)\, (y - x)^m\, e^{-(x-y)^2/(4a^2 t)}\, dy \]

in the domain $(x, t) \in \mathbb{R} \times [t_0, \infty)$, for $t_0 > 0$ arbitrarily fixed and $k$ and $m$ positive
real numbers. The change of variable

\[ z = \frac{y - x}{2a\sqrt{t}} \]

transforms this integral into

\[ (2a)^{m+1}\, t^{(m+1)/2 - k} \int_{\mathbb{R}} f(x + 2az\sqrt{t})\, z^m e^{-z^2}\, dz. \]

The absolute value of the integrand $f(x + 2az\sqrt{t})\, z^m e^{-z^2}$ is bounded above by an
integrable function that does not depend on $x$ and $t$ (because $f$ is bounded).
Once the possibility to differentiate under the integral sign is established, the
fact that $u$ is a solution of the diffusion equation follows from the remark that

\[ v = \frac{1}{a\sqrt{2t}}\, e^{-x^2/(4a^2 t)} \]

is a solution.

The proof ends by showing the continuity of $u$ at $t = 0$, that is,

\[ \lim_{t \to 0+} u(x, t) = f(x). \]


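Indeed,

\[ \begin{aligned}
|u(x, t) - f(x)| &\le \frac{1}{2a\sqrt{\pi t}} \int_{\mathbb{R}} |f(y) - f(x)|\, e^{-(x-y)^2/(4a^2 t)}\, dy \\
&= \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} \left| f(x + az\sqrt{2t}) - f(x) \right| e^{-z^2/2}\, dz,
\end{aligned} \]

and denoting $M = \sup_{x \in \mathbb{R}} |f(x)|$, the last integral can be bounded above by

\[ \frac{1}{\sqrt{2\pi}} \int_{|z| < \delta/(a\sqrt{2t})} \left| f(x + az\sqrt{2t}) - f(x) \right| e^{-z^2/2}\, dz
 + \frac{2M}{\sqrt{2\pi}} \int_{|z| \ge \delta/(a\sqrt{2t})} e^{-z^2/2}\, dz \]

and next by

\[ \sup_{|z| < \delta} |f(x + z) - f(x)| + 2M \cdot \frac{1}{\sqrt{2\pi}} \int_{|z| \ge \delta/(a\sqrt{2t})} e^{-z^2/2}\, dz. \]

The first term tends to 0 as $\delta \to 0$, and the second term tends to 0 as $t \to 0$. This
assures the continuity of $u$ at $t = 0$. □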
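
Poisson's Formula lends itself to a quick numerical sanity check. The following Python sketch is not part of the original text: the function name `poisson_solution`, the truncation parameter `half_width` and the use of the midpoint rule are illustrative assumptions. It evaluates the formula after the substitution $y = x + az\sqrt{2t}$ used in the proof, and compares the result with $e^{-a^2 t}\cos x$, which one can verify directly to be the solution for the bounded continuous datum $f(x) = \cos x$.

```python
import math


def poisson_solution(f, x, t, a=1.0, half_width=8.0, n=4000):
    """Approximate u(x, t) given by Poisson's Formula with the midpoint rule.

    Uses the form obtained by the substitution y = x + a*z*sqrt(2t):
        u(x, t) = (1/sqrt(2*pi)) * integral of f(x + a*z*sqrt(2t)) * exp(-z^2/2) dz.
    The integral is truncated to |z| <= half_width (an assumption; for a
    bounded f the discarded Gaussian tail is negligible).
    """
    if t == 0:
        return f(x)                      # initial condition (IC)
    h = 2.0 * half_width / n             # step of the midpoint rule
    scale = a * math.sqrt(2.0 * t)
    total = 0.0
    for k in range(n):
        z = -half_width + (k + 0.5) * h
        total += f(x + scale * z) * math.exp(-0.5 * z * z)
    return total * h / math.sqrt(2.0 * math.pi)


if __name__ == "__main__":
    a, x = 0.7, 0.3
    for t in (1.0, 0.1, 0.001):
        approx = poisson_solution(math.cos, x, t, a=a)
        exact = math.exp(-a * a * t) * math.cos(x)
        print(t, approx, exact)          # the values approach cos(x) as t -> 0+
```

As $t$ decreases to 0, the printed values approach $\cos x$, in agreement with the continuity of $u$ at $t = 0$.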


Exercises 


1. (An introduction to the Central Limit Theorem). Consider the rectangle function $\Pi$,
defined by $\Pi(x) = 1$ if $x \in [-1/2, 1/2]$ and $\Pi(x) = 0$ otherwise.
(a) Compute $\Pi * \Pi$, $\Pi * \Pi * \Pi$ and $\Pi * \Pi * \Pi * \Pi$. Notice that convolution is
a smoothing operation.
(b) Prove that $\sqrt{n}\, (\underbrace{\Pi * \Pi * \cdots * \Pi}_{n \text{ factors}})(\sqrt{n}\, x) \to \sqrt{6/\pi}\; e^{-6x^2}$ for every $x \in \mathbb{R}$.

[Hint: (b) Compute the convolution product using the Fourier transform.]

2. Apply the Fourier transform to find the solution of the following integral equa- 
tion: 


f@)+ / f(@—-t) eh [2 dt = e“ X10,00) (*), xeR. 
R 


3. Solve the Cauchy problem 


4. Solve the Cauchy problem 


du a? 
oT = a +e'sinx, u(x,0) =sinx, 
x 


seeking for a solution of the form u(x, t) = v(t) sinx. 


11.11 Notes and Remarks 


An account of the origins and development of Lebesgue's integral can be found in
the book of Hawkins [4].

There are many books containing a thorough presentation of the theory of the
Lebesgue integral. See Folland [5], Gordon [6], Hewitt and Stromberg [7], Lang
[8], Royden [9], Rudin [10] and Willem [11], to cite just a few. Our approach here
tries to keep the elements of measure theory to a minimum, at the cost of two technical
lemmas (Lemmas 11.3.2 and 11.3.3), whose details are given in a separate section.

The property of absolute continuity plays a major role in Probability Theory, via 
the Radon-Nikodym Theorem (a result related to Theorem 11.6.3). See the book of 
Chow and Teicher [12] for a thorough presentation. 

The measurable functions are nearly continuous in the sense of the following 
result due to Nikolai Luzin. 




11.11.1 Luzin's Theorem If $A$ is a measurable subset of $\mathbb{R}$ of finite measure and
$f : A \to \mathbb{C}$ is a measurable function, then for every $\varepsilon > 0$, there exists a compact
set $K_\varepsilon$ included in $A$ such that $\lambda(A \setminus K_\varepsilon) < \varepsilon$ and the restriction of $f$ to $K_\varepsilon$ is
continuous.


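Proof (due to Peter Loeb and Erik Talvila) According to the regularity property of
Lebesgue measure, for every measurable subset $A$ of $\mathbb{R}$ with $\lambda(A) < \infty$ and every
$\varepsilon > 0$, there exists a compact set $K \subset A$ such that $\lambda(A \setminus K) < \varepsilon$.

Let $(V_n)_n$ be an enumeration of the open intervals in $\mathbb{R}$ with rational endpoints
and fix compact sets $K_n \subset f^{-1}(V_n)$ and $K_n' \subset A \setminus f^{-1}(V_n)$ for each $n$ such that
$\lambda(A \setminus (K_n \cup K_n')) < \varepsilon/2^n$.

The set $K_\varepsilon = \bigcap_n (K_n \cup K_n')$ verifies $\lambda(A \setminus K_\varepsilon) < \varepsilon$. Given $x \in K_\varepsilon$, for every
$n$ such that $f(x) \in V_n$, the point $x$ belongs to the open set $U_n = \mathbb{R} \setminus K_n'$ and
$f(U_n \cap K_\varepsilon) \subset V_n$. This proves the continuity of the restriction of $f$ to $K_\varepsilon$ at $x$.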


An example of a nearly continuous function that is not almost everywhere continuous
is the characteristic function of a nowhere dense compact subset $K$ of $[0, 1]$, of positive
measure. Such a set can be obtained by a process similar to that of constructing
Cantor's triadic set. The only difference is that at each step we eliminate the open
middle fourth intervals.

The Hölder-Rogers Inequality was proved in slightly different forms by Leonard
James Rogers in 1888 and then by Otto Hölder in 1889 (Hölder even referred to
Rogers!). See the paper of Maligranda [13].

The $L^p$ spaces were introduced by Riesz [14] in 1910.

The Lebesgue measure can "measure" only the sets in the $\sigma$-algebra $\Sigma(\mathbb{R})$. It
is natural then to ask how large this family is compared to $\mathcal{P}(\mathbb{R})$, the $\sigma$-algebra of
all subsets of $\mathbb{R}$. The following example (constructed by Giuseppe Vitali in 1905)
shows that

\[ \Sigma(\mathbb{R}) \ne \mathcal{P}(\mathbb{R}). \]


His argument, which will be reproduced here, makes use of the Axiom of Choice. 

For $x \in [0, 1]$, we define $E_x = \{y \in [0, 1] : y - x \in \mathbb{Q}\}$ and we consider
the family $\mathcal{F} = \{E_x : x \in [0, 1]\}$. Obviously, $\mathcal{F}$ is nonempty and consists of
nonempty pairwise disjoint sets. Using the Axiom of Choice, there is $X \subset [0, 1]$ such that for
every $x \in [0, 1]$, $X \cap E_x$ has exactly one element. We will show that this set is not
measurable.

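Suppose it is. Then, the translates of $X$,

\[ X_r = X + r \quad (r \in \mathbb{Q}), \]

are also measurable and have the same measure as $X$. It is easy to see that for $r, s \in \mathbb{Q}$,
$r \ne s$, we have $X_r \cap X_s = \emptyset$. Moreover, $\bigcup_{r \in \mathbb{Q} \cap [0,1]} X_r \subset [0, 2]$. Using the countable
additivity of the Lebesgue measure, we get the estimate

\[ \sum_{r \in \mathbb{Q} \cap [0,1]} \lambda(X_r) = \lambda\Big( \bigcup_{r \in \mathbb{Q} \cap [0,1]} X_r \Big) \le \lambda([0, 2]) = 2. \]

Since $\lambda(X_r) = \lambda(X)$ for all $r \in \mathbb{Q}$, we infer that $\lambda(X) = 0$.
On the other hand, $[0, 1] \subset \mathbb{R} \subset \bigcup_{r \in \mathbb{Q}} X_r$ and thus

\[ 1 = \lambda([0, 1]) \le \lambda\Big( \bigcup_{r \in \mathbb{Q}} X_r \Big) = \sum_{r \in \mathbb{Q}} \lambda(X_r) = 0, \]

which is a contradiction.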

The case of the unit interval is not singular. Every measurable set of strictly 
positive measure contains a non-measurable set. 

It is worth mentioning that Solovay [15] proved in 1964 that the existence of
non-measurable sets cannot be proved without the Axiom of Choice. His argument uses
the ZF system, to which he added the WIC axiom, concerning the existence of weakly
inaccessible cardinals. In the ZF + WIC system, a weaker form of the axiom of
choice works, called DC (the axiom of dependent choice), which makes inductive
definitions possible. In 1980, S. Shelah proved that Solovay's result is optimal: from
the consistency of the system

\[ \mathrm{ZF} + \mathrm{DC} + (\Sigma(\mathbb{R}) = \mathcal{P}(\mathbb{R})), \]

the consistency of the system ZF + WIC follows. See the monograph of Wagon
[16] for more details.

Despite the difference between Lebesgue null sets and first Baire category sets,
they are related by a duality result due to Wacław Franciszek Sierpiński and Paul
Erdős:


11.11.2 Theorem (Sierpiński-Erdős Duality Theorem) Let $P$ be any proposition
involving solely the notions of measure zero, first category and notions of pure set
theory. Let $P^*$ be the proposition obtained from $P$ by interchanging the terms "null
set" and "set of first category", wherever they appear. Then each of the propositions
$P$ and $P^*$ implies the other, assuming the continuum hypothesis.


Details are to be found in the book of Oxtoby [17]. See also the paper by Diamond 
and Gelles [18]. 

The theorems concerning the dependence of Lebesgue integral on parameters are 
very important in applications. A glimpse is given by the last two sections on Fourier 
transform and its applications to differential equations. 

The approximation technique of Dirac sequences is due to Sergei Sobolev (1938) 
and to Kurt Otto Friedrichs (1944). Friedrichs used the so called mollifiers, which 
are illustrated by Theorem 11.7.6. 

A thorough introduction to the problems of mathematical physics (in particular, 
to diffusion equation) can be found in the book of Vladimirov [3]. 

An account of the history of the Cauchy-Frullani integral (Sect. 11.7, Exercise 7),
as well as some improvements, can be found in the paper of Ostrowski [2].

Computing integrals in compact form is rarely successful, but the Tables of Integrals,
Series and Products by Gradshteyn and Ryzhik [19] provide a valuable source.
Here is an example:

\[ \int_0^\infty \frac{x^{p-1} \log x}{1 + x^q}\, dx = -\frac{\pi^2}{q^2} \cdot \frac{\cos \frac{p\pi}{q}}{\sin^2 \frac{p\pi}{q}} \quad \text{for } 0 < p < q. \]

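In 2004, Borwein and Bailey [20] verified to over 20,000 decimals accuracy that

\[ \begin{aligned}
\frac{24}{7\sqrt{7}} \int_{\pi/3}^{\pi/2} \log \left| \frac{\tan t + \sqrt{7}}{\tan t - \sqrt{7}} \right| dt
\stackrel{?}{=} \sum_{n=0}^{\infty} \Big[ & \frac{1}{(7n+1)^2} + \frac{1}{(7n+2)^2} - \frac{1}{(7n+3)^2} \\
& + \frac{1}{(7n+4)^2} - \frac{1}{(7n+5)^2} - \frac{1}{(7n+6)^2} \Big],
\end{aligned} \]

but a proof of this "identity" is not yet known.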

Related to the Fourier transform is the Laplace transform, another integral transform
with very important applications in physics and engineering. This transform associates
to a measurable function $f : \mathbb{R} \to \mathbb{C}$ which vanishes for $t < 0$ and verifies
some suitable growth conditions at infinity (for example, $|f(t)| \le M e^{\alpha t}$ for some
constants $M > 0$ and $\alpha \in \mathbb{R}$), an analytic function defined by

\[ (\mathcal{L}f)(z) = \int_0^\infty f(t)\, e^{-zt}\, dt, \quad z \in \mathbb{C},\ \operatorname{Re} z > \omega_f, \]

where $\omega_f \in [-\infty, \infty)$. The theory of the Laplace transform is presented in the classical
book of Widder [21].

If the Laplace transform $\mathcal{L}f$ of a measurable positive function $f : (0, \infty) \to \mathbb{R}$
exists in the Lebesgue sense for every $z > 0$, then $\Phi = \mathcal{L}f$ is a completely monotonic
function, that is, it has derivatives of all orders and

\[ (-1)^n \Phi^{(n)}(x) \ge 0 \quad \text{for all } x > 0 \text{ and } n \in \mathbb{N}. \]

In particular, the functions $(a + bx)^{-\lambda}$ (for $a, b, \lambda \ge 0$ and $a + b > 0$), $e^{-\alpha x}$
(for $\alpha > 0$), $\ln\left(1 + \frac{1}{x}\right)$, the Gauss hypergeometric function $F(\alpha, \beta, \gamma, -x)$ (for
$\gamma > \beta > 0$ and $\alpha > 0$) (as well as all their derivatives of even order) are completely
monotonic functions. Every completely monotonic function is also log-convex. The
converse fails, as shown by the case of the Gamma function. A thorough presentation of
the theory of completely monotonic functions can be found in the aforementioned
book of Widder and the recent papers of Merkle [22] and Miller and Samko [23].


Chapter 12
Fourier Series 


Fourier series decompose periodic functions or periodic signals into the sum of
a countable family of simple oscillating functions, namely sines and cosines (or
complex exponentials). This fact offers a great advantage in handling a range of
problems such as: motion of vibrating structures, heat diffusion, signal processing,
electrical circuits with oscillating current sources, etc. During the last two centuries,
the study of Fourier series contributed greatly to the rigorization and development
of Analysis. Dirichlet, who is credited with the beginning of the rigorous study of
Fourier series in 1829, anticipated the general concept of function precisely in this
context. Riemann created his integral within his memoir on trigonometric series
(prepared in 1854, posthumously published in 1867). Jordan (1881) introduced the
class of functions of bounded variation while trying to extend Dirichlet's convergence test.
The uniqueness of the representation of a function as the sum of a trigonometric series
led Cantor (1870) to introduce a series of topological concepts already mentioned in
Chap. 5. In 1907, Riesz and Fischer related Lebesgue integration and convergence
in the $L^2$-norm to the theory of trigonometric series (and more generally, to the theory
of orthogonal expansions). And the list continues to the present day.


12.1 Hilbert Spaces: Orthogonal Expansions 


The main feature of Euclidean spaces is the presence of a scalar product. We will 
enlarge the study of this concept by considering an abstract setting that encompasses 
the Lebesgue spaces of square integrable functions. 

A scalar product (or inner product) on a linear space $H$ over the field $\mathbb{K}$ ($\mathbb{K}$ is $\mathbb{R}$
or $\mathbb{C}$) is a function

\[ \langle \cdot, \cdot \rangle : H \times H \to \mathbb{K} \]


with the following three properties: 
(SP1) $\langle x, x \rangle \ge 0$; $\langle x, x \rangle = 0$ if and only if $x = 0$
(SP2) $\langle x, y \rangle = \overline{\langle y, x \rangle}$
(SP3) $\langle \alpha x + \beta y, z \rangle = \alpha \langle x, z \rangle + \beta \langle y, z \rangle$


for every $x, y, z \in H$ and every $\alpha, \beta \in \mathbb{K}$.
Thus, a scalar product is linear in the first variable (when the second is kept fixed).
As a function of the second variable, we have

\[ \langle x, \alpha y + \beta z \rangle = \overline{\alpha} \langle x, y \rangle + \overline{\beta} \langle x, z \rangle, \]

which is a condition of anti-linearity in the complex case and of linearity in the real
case. In fact, in the real case, the condition (SP2) is a condition of symmetry,

\[ \langle x, y \rangle = \langle y, x \rangle \quad \text{for every } x, y \in H. \]


The linear spaces endowed with a scalar product are called pre-Hilbert spaces 
(or inner product spaces). 
To each scalar product, one can associate a Hilbertian norm,

\[ \|x\| = \langle x, x \rangle^{1/2}. \]


12.1.1 The Cauchy-Buniakovski-Schwarz inequality If $H$ is a pre-Hilbert
space, then for every $x, y \in H$, we have

\[ |\langle x, y \rangle| \le \|x\|\, \|y\|. \]

The equality holds if and only if $x$ and $y$ are linearly dependent.


In the real case, the proof is the same as the proof of Theorem 3.1.2; the complex
case needs only minor changes.

The fact that the Hilbertian norm verifies the triangle inequality (and thus it is
indeed a norm) can be deduced from the Cauchy-Buniakovski-Schwarz Inequality
as follows:

\[ \begin{aligned}
\|x + y\|^2 &= \langle x + y, x + y \rangle \\
&= \|x\|^2 + 2 \operatorname{Re} \langle x, y \rangle + \|y\|^2 \\
&\le \|x\|^2 + 2 \|x\|\, \|y\| + \|y\|^2 \\
&= (\|x\| + \|y\|)^2.
\end{aligned} \]


12.1.2 Lemma The scalar product is continuous in each variable (when the other 
is kept fixed) with respect to the metric topology associated to the Hilbertian norm. 


Proof The fact that $x_n \to x$ implies $\langle x_n, y \rangle \to \langle x, y \rangle$ for every $y$ follows from the
inequality $|\langle x_n, y \rangle - \langle x, y \rangle| = |\langle x_n - x, y \rangle| \le \|x_n - x\|\, \|y\|$. The continuity in the
second argument combines this remark with the property (SP2).

The pre-Hilbert spaces are not necessarily complete. See the case of the space
$C([a, b])$, endowed with the scalar product

\[ \langle f, g \rangle = \int_a^b f(x)\, \overline{g(x)}\, dx. \tag{12.1} \]


The pre-Hilbert spaces that are complete with respect to their Hilbertian norms 
are called Hilbert spaces. 

The simplest examples of Hilbert spaces are the Euclidean spaces. 

Hilbert himself studied the space $\ell^2(\mathbb{N})$, of all square summable sequences of
complex numbers, that is, of all sequences $x = (a_n)_n$ such that

\[ \|x\|_{\ell^2} = \Big( \sum_{n=0}^{\infty} |a_n|^2 \Big)^{1/2} < \infty. \]

This is a linear space with respect to the coordinatewise operations of addition and
multiplication by complex numbers and a normed linear space with respect to the
norm $\|\cdot\|_{\ell^2}$; this norm is associated to the scalar product defined by the formula:

\[ \langle x, y \rangle_{\ell^2} = \sum_{n=0}^{\infty} a_n \overline{b_n}, \]

for every $x = (a_n)_n$ and every $y = (b_n)_n$ in $\ell^2(\mathbb{N})$. The completeness of $\ell^2(\mathbb{N})$ is
left as Exercise 9.

The Lebesgue space $L^2([a, b])$ is another example of a Hilbert space. Its norm
$\|\cdot\|_{L^2}$ is associated to the scalar product defined by the formula (12.1).

It is easy to check that the completion of a pre-Hilbert space is a Hilbert space.
The completion of $C([a, b])$ with respect to the $L^2$-norm is $L^2([a, b])$
(because $C([a, b])$ is a dense subspace of it). For a proof, adapt the argument of
Theorem 11.5.8.


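12.1.3 Proposition If $H$ is a pre-Hilbert space and $x, y \in H$, then the following
hold true:
(a) The parallelogram identity,

\[ \|x + y\|^2 + \|x - y\|^2 = 2 \left( \|x\|^2 + \|y\|^2 \right); \]

(b) The polarization identity,

\[ \langle x, y \rangle = \begin{cases} \dfrac{1}{4} \|x + y\|^2 - \dfrac{1}{4} \|x - y\|^2 & \text{in the real case} \\[1.5ex] \dfrac{1}{4} \displaystyle\sum_{k=0}^{3} i^k \|x + i^k y\|^2 & \text{in the complex case.} \end{cases} \]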
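
Both identities of Proposition 12.1.3 are easy to test numerically in the Hilbert space $\mathbb{C}^n$. The Python sketch below is an illustration added here, not part of the original text; the helper names `inner`, `norm_sq`, `polarization` and `parallelogram_gap` are hypothetical. It uses the convention (SP3), that is, a scalar product linear in the first variable.

```python
def inner(x, y):
    """Scalar product on C^n: linear in x, anti-linear in y."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))


def norm_sq(x):
    """Squared Hilbertian norm <x, x> (a nonnegative real number)."""
    return inner(x, x).real


def polarization(x, y):
    """Recover <x, y> from norms alone: (1/4) * sum_k i^k * ||x + i^k y||^2."""
    units = [1, 1j, -1, -1j]             # i^k for k = 0, 1, 2, 3
    total = 0j
    for u in units:
        shifted = [a + u * b for a, b in zip(x, y)]
        total += u * norm_sq(shifted)
    return total / 4


def parallelogram_gap(x, y):
    """Left side minus right side of the parallelogram identity (about 0)."""
    plus = [a + b for a, b in zip(x, y)]
    minus = [a - b for a, b in zip(x, y)]
    return norm_sq(plus) + norm_sq(minus) - 2 * (norm_sq(x) + norm_sq(y))


if __name__ == "__main__":
    x = [1 + 2j, -0.5j, 3.0]
    y = [2 - 1j, 1 + 1j, -1j]
    print(inner(x, y), polarization(x, y))   # the two values agree up to rounding
    print(parallelogram_gap(x, y))           # close to 0
```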

It is important to notice that the pre-Hilbert spaces are the only normed linear
spaces where the parallelogram identity works. See Exercise 1.
In any pre-Hilbert space, one can define the concept of orthogonality.


12.1.4 Definition Two vectors $x$ and $y$ are said to be orthogonal (that is, $x \perp y$) if
$\langle x, y \rangle = 0$.
More generally, a family of vectors $(e_\alpha)_\alpha$ is said to be orthogonal if

\[ \langle e_\alpha, e_\beta \rangle = 0 \quad \text{for } \alpha \ne \beta \]

and orthonormal if

\[ \langle e_\alpha, e_\beta \rangle = \begin{cases} 0 & \text{for } \alpha \ne \beta \\ 1 & \text{for } \alpha = \beta. \end{cases} \]

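The natural algebraic basis of the Euclidean space $\mathbb{R}^N$ is an example of an
orthonormal family.
The real trigonometric system, which consists of the functions

\[ 1,\ \cos x,\ \sin x,\ \cos 2x,\ \sin 2x,\ \ldots,\ \cos nx,\ \sin nx,\ \ldots \]

provides an example of an orthogonal family in each of the spaces $L^2_{\mathbb{R}}([a, a + 2\pi])$
with $a \in \mathbb{R}$ (in particular, in $L^2_{\mathbb{R}}([-\pi, \pi])$). By normalization (that is, by dividing
each vector by its norm), we get the normalized trigonometric system,

\[ \frac{1}{\sqrt{2\pi}},\ \frac{\cos x}{\sqrt{\pi}},\ \frac{\sin x}{\sqrt{\pi}},\ \frac{\cos 2x}{\sqrt{\pi}},\ \frac{\sin 2x}{\sqrt{\pi}},\ \ldots,\ \frac{\cos nx}{\sqrt{\pi}},\ \frac{\sin nx}{\sqrt{\pi}},\ \ldots \]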


In the spaces $L^2([a, a + 2\pi])$, a similar role is played by the complex
trigonometric system, which consists of the functions

\[ \chi_n(x) = e^{inx}, \quad n \in \mathbb{Z}. \]

Each of these functions has norm $\sqrt{2\pi}$.

Both real and complex trigonometric systems consist of periodic functions, of 
period 27. 

Each orthonormal family is equally a linearly independent family. Exercise 7 
describes the Gram-Schmidt orthogonalization algorithm, which associates to each 
family of vectors an orthogonal family generating the same linear space. 
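
The formulas of Exercise 7 translate almost literally into code. The following Python sketch (an illustration added here, restricted by assumption to real vectors in $\mathbb{R}^n$; the names `inner` and `gram_schmidt` are hypothetical) subtracts from each vector $x_k$ its components $\frac{\langle x_k, v_j \rangle}{\langle v_j, v_j \rangle} v_j$ along the previously constructed vectors. The resulting family is orthogonal but not normalized.

```python
def inner(u, v):
    """Euclidean scalar product on R^n."""
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(vectors):
    """Orthogonalize a list of linearly independent real vectors.

    Implements v_0 = x_0 and
        v_k = x_k - sum_{j<k} (<x_k, v_j> / <v_j, v_j>) v_j.
    Linear independence is assumed; otherwise some v_j is zero and the
    division below fails.
    """
    ortho = []
    for x in vectors:
        v = list(x)
        for w in ortho:
            coeff = inner(x, w) / inner(w, w)
            v = [vi - coeff * wi for vi, wi in zip(v, w)]
        ortho.append(v)
    return ortho


if __name__ == "__main__":
    family = gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
    for v in family:
        print(v)
```

Dividing each output vector by its norm would turn the orthogonal family into an orthonormal one.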

Given an orthonormal system $(e_n)_n$ in a Hilbert space $H$, one can associate to
each element $x \in H$ its Fourier coefficients

\[ c_n = \langle x, e_n \rangle \]

and the Fourier series

\[ \sum_{n=0}^{\infty} c_n e_n. \]

The convergence of this series means the existence of an element $S \in H$ such that

\[ \lim_{N \to \infty} \Big\| S - \sum_{n=0}^{N} c_n e_n \Big\| = 0. \]

An orthonormal system in $H$ is an orthonormal basis (or a Hilbert basis) if every
element $x \in H$ is the sum of its associated Fourier series. This provides an
infinite-dimensional analogue of the natural basis in a Euclidean space.

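The circumstances under which a countable orthonormal family $(e_n)_{n \in \mathbb{N}}$ generates
a Hilbert basis are clarified by Theorem 12.1.8. Its proof needs some preparation.


12.1.5 Theorem (Pythagorean Identity) If $\{x_0, x_1, \ldots, x_n\}$ is an orthogonal family
in a pre-Hilbert space, then

\[ \|x_0 + \cdots + x_n\|^2 = \|x_0\|^2 + \cdots + \|x_n\|^2. \]

Proof We have

\[ \Big\| \sum_{k=0}^{n} x_k \Big\|^2 = \Big\langle \sum_{j=0}^{n} x_j, \sum_{k=0}^{n} x_k \Big\rangle = \sum_{j=0}^{n} \sum_{k=0}^{n} \langle x_j, x_k \rangle = \sum_{j=0}^{n} \langle x_j, x_j \rangle = \sum_{j=0}^{n} \|x_j\|^2. \]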


12.1.6 Bessel's Inequality Let $(e_n)_{n \in \mathbb{N}}$ be an orthonormal sequence in the
pre-Hilbert space $H$. Then

\[ \sum_{n=0}^{\infty} |\langle x, e_n \rangle|^2 \le \|x\|^2 \]

for every $x \in H$.


Proof For every natural number $N$, the vector $x - \sum_{n=0}^{N} \langle x, e_n \rangle e_n$ is perpendicular
to each of the vectors $e_0, \ldots, e_N$. Therefore, taking into account the Pythagorean
Identity, we have

\[ \begin{aligned}
\|x\|^2 &= \Big\| x - \sum_{n=0}^{N} \langle x, e_n \rangle e_n + \sum_{n=0}^{N} \langle x, e_n \rangle e_n \Big\|^2 \\
&= \Big\| x - \sum_{n=0}^{N} \langle x, e_n \rangle e_n \Big\|^2 + \sum_{n=0}^{N} |\langle x, e_n \rangle|^2 \\
&\ge \sum_{n=0}^{N} |\langle x, e_n \rangle|^2.
\end{aligned} \]

The proof ends by passing to the limit over $N$.

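12.1.7 Theorem of Best Approximation Let $e_0, \ldots, e_n$ be a finite orthonormal
family in the pre-Hilbert space $H$ and $x \in H$. Then for all numbers $a_0, \ldots, a_n$, we
have

\[ \Big\| x - \sum_{k=0}^{n} \langle x, e_k \rangle e_k \Big\| \le \Big\| x - \sum_{k=0}^{n} a_k e_k \Big\|. \]

In other words, the best approximation of a vector $x$ by elements of the linear
span of $e_0, \ldots, e_n$ is given by the linear combination whose coefficients are
precisely the Fourier coefficients.


Proof Indeed, according to the Pythagorean Identity,

\[ \begin{aligned}
\Big\| x - \sum_{k=0}^{n} a_k e_k \Big\|^2 &= \Big\| x - \sum_{k=0}^{n} \langle x, e_k \rangle e_k + \sum_{k=0}^{n} (\langle x, e_k \rangle - a_k) e_k \Big\|^2 \\
&= \Big\| x - \sum_{k=0}^{n} \langle x, e_k \rangle e_k \Big\|^2 + \sum_{k=0}^{n} |\langle x, e_k \rangle - a_k|^2 \\
&\ge \Big\| x - \sum_{k=0}^{n} \langle x, e_k \rangle e_k \Big\|^2.
\end{aligned} \]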
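
A small numerical experiment may make Bessel's Inequality and the Theorem of Best Approximation more concrete. The Python sketch below is an added illustration in $\mathbb{R}^3$; the orthonormal pair, the vector `x` and the helper names are arbitrary choices made here, not taken from the text. It compares the distance from $x$ to the combination built with the Fourier coefficients against the distance obtained with perturbed coefficients.

```python
import math


def inner(u, v):
    """Euclidean scalar product on R^n."""
    return sum(a * b for a, b in zip(u, v))


def distance_to_combination(x, basis, coeffs):
    """Norm of x - sum_k coeffs[k] * basis[k]."""
    residual = list(x)
    for c, e in zip(coeffs, basis):
        residual = [r - c * ei for r, ei in zip(residual, e)]
    return math.sqrt(inner(residual, residual))


S = 1.0 / math.sqrt(2.0)
BASIS = [[S, S, 0.0], [S, -S, 0.0]]      # an orthonormal pair in R^3
X = [1.0, 2.0, 3.0]
FOURIER = [inner(X, e) for e in BASIS]   # Fourier coefficients <x, e_k>

if __name__ == "__main__":
    best = distance_to_combination(X, BASIS, FOURIER)
    print("Fourier coefficients:", FOURIER)
    print("best distance:", best)
    for shift in (0.5, -0.2, 0.01):
        other = [FOURIER[0] + shift, FOURIER[1] - shift]
        print(shift, distance_to_combination(X, BASIS, other))  # never below best
    # Bessel: the sum of squared coefficients does not exceed ||x||^2
    print(sum(c * c for c in FOURIER), inner(X, X))
```

Here the span of the two basis vectors is the horizontal plane, so the best distance is the absolute value of the third coordinate of `X`.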


We state now a criterion to decide when an orthonormal family is a Hilbert basis: 


12.1.8 Theorem on Fourier Series Expansion Let $(e_n)_{n \in \mathbb{N}}$ be an orthonormal
sequence in the pre-Hilbert space $H$. Then, the following assertions are equivalent:
(a) every $x \in H$ is the sum of its associated Fourier series;
(b) the Parseval identity,

\[ \|x\|^2 = \sum_{n=0}^{\infty} |\langle x, e_n \rangle|^2, \]

holds for every $x \in H$;
(c) the smallest linear subspace of $H$ containing all elements $e_n$ is dense in $H$.


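Proof The equivalence of the conditions (a) and (b) follows from the fact that

\[ \Big\| x - \sum_{n=0}^{N} \langle x, e_n \rangle e_n \Big\|^2 = \|x\|^2 - \sum_{n=0}^{N} |\langle x, e_n \rangle|^2, \]

which is a consequence of the Pythagorean Identity. See the proof of Bessel's
Inequality.

The implication (a) $\Rightarrow$ (c) is clear. We will prove that (c) $\Rightarrow$ (a). For $\varepsilon > 0$
arbitrarily given, there exist numbers $a_0, \ldots, a_N$ such that

\[ \Big\| x - \sum_{k=0}^{N} a_k e_k \Big\| < \varepsilon. \]

Then, by Theorem 12.1.7,

\[ \Big\| x - \sum_{k=0}^{n} \langle x, e_k \rangle e_k \Big\| \le \Big\| x - \sum_{k=0}^{N} \langle x, e_k \rangle e_k \Big\| \le \Big\| x - \sum_{k=0}^{N} a_k e_k \Big\| < \varepsilon \]

for all $n \ge N$. This shows that $x = \sum_{k=0}^{\infty} \langle x, e_k \rangle e_k$.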


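We know that the space $C([-\pi, \pi], \mathbb{R})$ is dense in $L^2_{\mathbb{R}}([-\pi, \pi])$. See Exercise 9,
Sect. 11.8. Thus, for every $f \in L^2_{\mathbb{R}}([-\pi, \pi])$ and every $\varepsilon > 0$, there exists a function
$f_\varepsilon \in C([-\pi, \pi], \mathbb{R})$ such that

\[ \int_{-\pi}^{\pi} |f(x) - f_\varepsilon(x)|^2\, dx < \varepsilon^2. \]

We may assume that $f_\varepsilon(-\pi) = f_\varepsilon(\pi)$. Indeed, we can modify $f_\varepsilon$ on the intervals
$[-\pi, -\pi + \delta]$ and $[\pi - \delta, \pi]$ (with $\delta > 0$ small enough) by replacing the graph of
$f_\varepsilon$ by the linear segments joining the points $(-\pi, 0)$ and $(-\pi + \delta, f_\varepsilon(-\pi + \delta))$,
respectively $(\pi - \delta, f_\varepsilon(\pi - \delta))$ and $(\pi, 0)$. Denoting this new function by $f_{\varepsilon\delta}$, we
have

\[ \begin{aligned}
\int_{-\pi}^{\pi} |f_\varepsilon(x) - f_{\varepsilon\delta}(x)|^2\, dx &= \int_{-\pi}^{-\pi+\delta} |f_\varepsilon(x) - f_{\varepsilon\delta}(x)|^2\, dx + \int_{\pi-\delta}^{\pi} |f_\varepsilon(x) - f_{\varepsilon\delta}(x)|^2\, dx \\
&\le 2 \Big( 4\delta \sup_{x \in [-\pi, \pi]} |f_\varepsilon(x)|^2 \Big) < \varepsilon^2,
\end{aligned} \]

provided that $0 < \delta < \varepsilon^2 / (8 \sup_{x \in [-\pi, \pi]} |f_\varepsilon(x)|^2)$. According to the Weierstrass
Approximation Theorem for continuous periodic functions, there exists a trigonometric
polynomial $P(x)$ such that

\[ \sup_{x \in [-\pi, \pi]} |f_{\varepsilon\delta}(x) - P(x)| < \varepsilon. \]

Then

\[ \begin{aligned}
\|f - P\|_{L^2} &\le \|f - f_\varepsilon\|_{L^2} + \|f_\varepsilon - f_{\varepsilon\delta}\|_{L^2} + \|f_{\varepsilon\delta} - P\|_{L^2} \\
&< \varepsilon + \varepsilon + \Big( \int_{-\pi}^{\pi} |f_{\varepsilon\delta}(x) - P(x)|^2\, dx \Big)^{1/2} \\
&< 2\varepsilon + \varepsilon\sqrt{2\pi} = \varepsilon \left( 2 + \sqrt{2\pi} \right),
\end{aligned} \]

which shows that the space of trigonometric polynomials is dense in the Hilbert space
$L^2_{\mathbb{R}}([-\pi, \pi])$. Taking into account Theorem 12.1.8, we infer the following important
result: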

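12.1.9 Theorem The normalized real trigonometric system

\[ \frac{1}{\sqrt{2\pi}},\ \frac{\cos x}{\sqrt{\pi}},\ \frac{\sin x}{\sqrt{\pi}},\ \frac{\cos 2x}{\sqrt{\pi}},\ \frac{\sin 2x}{\sqrt{\pi}},\ \ldots,\ \frac{\cos nx}{\sqrt{\pi}},\ \frac{\sin nx}{\sqrt{\pi}},\ \ldots \]

is a Hilbert basis for the space $L^2_{\mathbb{R}}([-\pi, \pi])$.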

12.1.10 Corollary Every function $f \in L^2_{\mathbb{R}}([-\pi, \pi])$ is the sum of its Fourier series

\[ \frac{a_0}{2} + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx), \tag{12.2} \]

where the coefficients $a_n$ and $b_n$ are given by the formulas

\[ a_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t) \cos nt\, dt, \quad \text{for } n \ge 0 \]

and

\[ b_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t) \sin nt\, dt, \quad \text{for } n \ge 1. \]

In this case, Parseval's Identity has the form

\[ \frac{1}{\pi} \int_{-\pi}^{\pi} |f(t)|^2\, dt = \frac{a_0^2}{2} + \sum_{n=1}^{\infty} (a_n^2 + b_n^2). \]

It is worth noticing that the Fourier series expansion mentioned in Corollary
12.1.10 refers to convergence in the $L^2$-norm. Thus, denoting the partial sums as

\[ S_n(f; x) = \frac{a_0}{2} + \sum_{k=1}^{n} (a_k \cos kx + b_k \sin kx), \]

we have $\|S_n(f; \cdot) - f\|_{L^2} \to 0$. A famous result due to Carleson [1] asserts that
$f$ is also the almost everywhere limit of its Fourier series.
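
The coefficient formulas of Corollary 12.1.10 and Parseval's Identity can be explored numerically. The Python sketch below is an added illustration; the midpoint rule, the sample count and the function names are assumptions made here. It approximates $a_n$ and $b_n$ for $f(t) = t$ on $[-\pi, \pi]$, for which a direct integration by parts gives $a_n = 0$ and $b_n = 2(-1)^{n+1}/n$, and compares a partial sum of the right-hand side of Parseval's Identity with $\frac{1}{\pi}\int_{-\pi}^{\pi} t^2\, dt = 2\pi^2/3$.

```python
import math


def fourier_coefficients(f, n_max, samples=4000):
    """Approximate a_0..a_{n_max} and b_1..b_{n_max} on [-pi, pi].

    Uses the midpoint rule for (1/pi) * integral of f(t) cos(nt) dt and
    (1/pi) * integral of f(t) sin(nt) dt. b[0] is a placeholder equal to 0.
    """
    h = 2.0 * math.pi / samples
    ts = [-math.pi + (j + 0.5) * h for j in range(samples)]
    vals = [f(t) for t in ts]
    a = [sum(v * math.cos(n * t) for v, t in zip(vals, ts)) * h / math.pi
         for n in range(n_max + 1)]
    b = [0.0] + [sum(v * math.sin(n * t) for v, t in zip(vals, ts)) * h / math.pi
                 for n in range(1, n_max + 1)]
    return a, b


def parseval_partial_sum(a, b):
    """a_0^2 / 2 + sum over n >= 1 of (a_n^2 + b_n^2), truncated at len(a) - 1."""
    return a[0] ** 2 / 2 + sum(a[n] ** 2 + b[n] ** 2 for n in range(1, len(a)))


if __name__ == "__main__":
    a, b = fourier_coefficients(lambda t: t, 10)
    for n in range(1, 4):
        print(n, b[n], 2 * (-1) ** (n + 1) / n)
    print(parseval_partial_sum(a, b), 2 * math.pi ** 2 / 3)
```

The partial sums increase towards $2\pi^2/3$ as more coefficients are included, slowly in this example because $b_n$ decays only like $1/n$.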


12.1.11 Remark (The scaled trigonometric system) In the case of the spaces
$L^2_{\mathbb{R}}([a, a + 2T])$ (where $a \in \mathbb{R}$ and $T$ is a positive number), the role of the
normalized real trigonometric system is played by the orthonormal system

\[ \frac{1}{\sqrt{2T}},\ \frac{1}{\sqrt{T}} \cos \frac{\pi x}{T},\ \frac{1}{\sqrt{T}} \sin \frac{\pi x}{T},\ \frac{1}{\sqrt{T}} \cos \frac{2\pi x}{T},\ \frac{1}{\sqrt{T}} \sin \frac{2\pi x}{T},\ \ldots; \tag{12.3} \]

notice that this system is independent of $a$. The Fourier trigonometric series associated
to a function $f \in L^2_{\mathbb{R}}([-T, T])$ has the form

\[ \frac{a_0}{2} + \sum_{n=1}^{\infty} \left( a_n \cos \frac{n\pi x}{T} + b_n \sin \frac{n\pi x}{T} \right) \tag{12.4} \]

where the coefficients $a_n$ and $b_n$ are defined by the formulas

\[ a_n = \frac{1}{T} \int_{-T}^{T} f(t) \cos \frac{n\pi t}{T}\, dt, \quad \text{for } n \ge 0 \]

and

\[ b_n = \frac{1}{T} \int_{-T}^{T} f(t) \sin \frac{n\pi t}{T}\, dt, \quad \text{for } n \ge 1. \]

In the case of a function $f \in L^2_{\mathbb{R}}([a, a + 2T])$, the interval of integration should be
changed to $[a, a + 2T]$.

Clearly, if f is an odd function, then all coefficients a, vanish and the associated 
Fourier series is a series of sines. When f is even, then all coefficients b, vanish and 
the Fourier series is a series of cosines. 

The argument of Theorem 12.1.8 can be adapted easily to show that the system
(12.3) is a Hilbert basis in each of the spaces $L^2_{\mathbb{R}}([a, a + 2T])$. Thus,
Corollary 12.1.10 extends to this context (with the only obvious change concerning the
interval of integration).


12.1.12 Remark (The complex trigonometric system) The normalized complex
trigonometric system,

\[ \chi_n(x) = \frac{e^{inx}}{\sqrt{2\pi}}, \quad n \in \mathbb{Z}, \]

constitutes a Hilbert basis in $L^2([-\pi, \pi])$. See the argument of Theorem 12.1.8. The
same is true for all spaces $L^2([a, a + 2\pi])$.


The Fourier trigonometric series associated to a function $f \in L^2([-\pi, \pi])$ is by
definition the series

\[ \sum_{n=-\infty}^{\infty} c_n \chi_n(x), \tag{12.5} \]

where the coefficients $c_n$ are given by the formula

\[ c_n = \frac{1}{\sqrt{2\pi}} \int_{-\pi}^{\pi} f(t)\, e^{-int}\, dt, \quad n \in \mathbb{Z}; \]

since their definition resembles the usual Fourier transform, they will also be denoted
$\hat{f}(n)$.
The partial sums of the series (12.5) are defined by the formula

\[ S_n(f; x) = \frac{1}{\sqrt{2\pi}} \sum_{k=-n}^{n} c_k e^{ikx}, \quad n \in \mathbb{N}. \]

As in the real case, every function $f \in L^2((-\pi, \pi))$ is the sum (in the sense of the
$L^2$-norm) of its Fourier trigonometric series, that is, $\|f - S_n(f; \cdot)\|_{L^2} \to 0$ as
$n \to \infty$. Moreover,

\[ \int_{-\pi}^{\pi} |f(t)|^2\, dt = \sum_{n \in \mathbb{Z}} |c_n|^2. \]



In the case of the space $L^2((-T, T))$, the analogue of the above theory applies
to the scaled trigonometric system,

\[ \frac{1}{\sqrt{2T}}\, e^{in\pi x/T}, \quad n \in \mathbb{Z}. \]

All considerations that were the object of Remark 12.1.11 can be adapted to the
context of complex-valued functions.

We can associate a Fourier series of the form (12.2) to every function $f$ in
$L^1((-\pi, \pi))$, since the formulas for the coefficients still make sense in that case.
However, a result of the type

\[ \|f - S_n(f; \cdot)\|_{L^1} \to 0 \]

does not hold in general. What can be said at this degree of generality is that
the Fourier coefficients of every function $f \in L^1((-\pi, \pi))$ tend to zero, that is,
$\lim_{|n| \to \infty} \hat{f}(n) = 0$. This follows from the following result:


12.1.13 Riemann-Lebesgue Lemma Let $f : [a, b] \to \mathbb{C}$ be a Lebesgue integrable
function. Then for every $\alpha \in \mathbb{R} \setminus \{0\}$,

\[ \lim_{n \to \infty} \int_a^b f(x)\, e^{in\alpha x}\, dx = 0. \]

The proof is similar to that of Lemma 11.9.1.

Exercises 


1. (Jordan-von Neumann Theorem). Let H be a normed linear space for which 
the parallelogram identity holds true. Prove that its norm is the Hilbert norm 
associated to the scalar product (-, -) defined by the identity of polarization. 
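[Hint: Notice that

\[ \langle x + y, z \rangle + \langle x - y, z \rangle = 2 \langle x, z \rangle \]

and

\[ \langle x, z \rangle + \langle y, z \rangle = 2 \Big\langle \frac{x + y}{2}, z \Big\rangle = \langle x + y, z \rangle.] \]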


2. Consider the function $f$ defined on $[-\pi, \pi]$ by $f(x) = |x|$, and extended (by
periodicity) to $\mathbb{R}$ as a function of period $2\pi$.
(a) Sketch its graph.
(b) Compute the Fourier series associated to the function $f$.
(c) Use Parseval's Identity to show that

\[ \sum_{n=0}^{\infty} \frac{1}{(2n+1)^4} = \frac{\pi^4}{96}. \]

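3. (a) Verify that the system of functions

\[ \frac{1}{\sqrt{2T}},\ \frac{1}{\sqrt{T}} \cos \frac{\pi x}{T},\ \frac{1}{\sqrt{T}} \sin \frac{\pi x}{T},\ \frac{1}{\sqrt{T}} \cos \frac{2\pi x}{T},\ \frac{1}{\sqrt{T}} \sin \frac{2\pi x}{T},\ \ldots \]

is orthonormal in $L^2_{\mathbb{R}}([-T, T])$.
(b) Prove that this system is a Hilbert basis for $L^2_{\mathbb{R}}([-T, T])$.
(c) Write down Parseval's Identity in the case of this basis.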


4. Consider the periodic function $f$, of period 1, given by $f(x) = x$ for $x \in [0, 1)$
and $f(1) = 0$.
(a) Sketch its graph.
(b) Compute its Fourier trigonometric series.
(c) What does Parseval's Identity become in this case?

5. Let $a \in \mathbb{R} \setminus \mathbb{Z}$, and consider the function $f(x) = e^{iax}$ for $x \in (-\pi, \pi)$. Infer
from the Parseval identity that

\[ \sum_{n=-\infty}^{\infty} \frac{1}{(a - n)^2} = \frac{\pi^2}{\sin^2 a\pi}. \]

6. (Cantor's uniqueness theorem). Suppose that $f \in L^2([-\pi, \pi])$ and

\[ \sum_{n=-\infty}^{\infty} \hat{f}(n)\, e^{inx} = 0 \]

for every $x$. Prove that all coefficients $\hat{f}(n)$ vanish following the next steps:
(a) Consider the auxiliary function

\[ F(x) = \hat{f}(0)\, \frac{x^2}{2} + \sum_{n \ne 0} \frac{\hat{f}(n)}{(in)^2}\, e^{inx} \]

and show that

\[ \left| \frac{\hat{f}(n)}{(in)^2}\, e^{inx} + \frac{\hat{f}(-n)}{(in)^2}\, e^{-inx} \right| \le \frac{\sup \{ |\hat{f}(k)| + |\hat{f}(-k)| : k \in \mathbb{N} \}}{n^2} \]

for all $n \ne 0$. Apply the Weierstrass M-test to obtain the uniform convergence
of the series defining $F(x)$ and the continuity of this function.
(b) Prove that the symmetric upper derivative of order two of $F$ vanishes,

\[ \limsup_{h \to 0} \frac{F(x + h) - 2F(x) + F(x - h)}{h^2} = 0. \]

(c) Infer from (b) that $F(x)$ is an affine function.
(d) The left-hand side of the equation $\sum_{n \ne 0} \frac{\hat{f}(n)}{(in)^2}\, e^{inx} = mx + n - \hat{f}(0)\, \frac{x^2}{2}$ is a
bounded function. Prove that $\hat{f}(0) = 0$ and $m = 0$.
(e) Observe that $\hat{f}(n) = \frac{1}{\sqrt{2\pi}} \int_{-\pi}^{\pi} S_N(f; x)\, e^{-inx}\, dx$ for all $N > |n|$ and
conclude that $\hat{f}(n) = 0$ for all $n$.

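7. (The Gram-Schmidt orthogonalization algorithm). Let $(x_k)_{k=0}^{n}$ be a finite family
of linearly independent vectors in the pre-Hilbert space $H$. Prove that the family

\[ \begin{aligned}
v_0 &= x_0 \\
v_k &= x_k - \sum_{j=0}^{k-1} \frac{\langle x_k, v_j \rangle}{\langle v_j, v_j \rangle}\, v_j, \quad k = 1, \ldots, n,
\end{aligned} \]

is an orthogonal family such that

\[ \operatorname{span} \{v_0, \ldots, v_n\} = \operatorname{span} \{x_0, \ldots, x_n\}. \]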


8. Let $H$ be a separable Hilbert space. Prove that $H$ has a Hilbert basis.

9. Prove that the sequence space $\ell^2(\mathbb{N})$ is a Hilbert space.

10. Let $(e_n)_{n \in \mathbb{N}}$ be a Hilbert basis of the space $L^2([-\pi, \pi])$. Prove that the function
$T : L^2([-\pi, \pi]) \to \ell^2(\mathbb{N})$, given by $T(x) = (\langle x, e_n \rangle)_n$, is an isomorphism of
linear spaces and an isometry.

11. Let $I$ be an open interval. The Sobolev space $W^{1,2}(I)$ consists of all functions
$f \in L^2(I)$ which admit a generalized derivative $Df \in L^2(I)$ in the sense that

\[ \int_I f(x)\, \varphi'(x)\, dx = -\int_I Df(x)\, \varphi(x)\, dx \quad \text{for all } \varphi \in C_c^\infty(I). \]

Prove that $W^{1,2}(I)$ is a Hilbert space with respect to the norm associated to the
scalar product

\[ \langle f, g \rangle_{W^{1,2}} = \langle f, g \rangle_{L^2} + \langle Df, Dg \rangle_{L^2}. \]

12. Prove that for every $f \in W^{1,2}(I)$, there exists a continuous function $g : I \to \mathbb{C}$
such that $f = g$ almost everywhere.


12.2 Pointwise Convergence of Fourier Series 


Theorem 12.1.9 (together with Corollary 12.1.10 and Remark 12.1.11) solved the
problem of Fourier series expansion within the class of square integrable functions,
making use of the convergence in the $L^2$-norm. The aim of this section is to discuss
under which conditions the Fourier series of a function converges pointwise to that
function.

We start by mentioning that the Fourier series of a continuous function need not
converge pointwise. Even more, using arguments invoking the Baire category
theorem, one can show that the family of continuous functions whose Fourier series
converges at a given $x_0$ is of first Baire category in the Banach space of continuous
functions on the unit circle (which coincides with the space of all continuous periodic
functions of period $2\pi$). So, in some sense, pointwise convergence is atypical, and
for most continuous functions the Fourier series does not converge at a given point.
See Bhatia [2], p. 39. However, according to a result due to Carleson [1], every
continuous periodic function is the almost everywhere limit of its Fourier series.

Exercise 13 provides a sketch of Fejér's Approximation Theorem, which asserts
that every continuous function $f : [-\pi, \pi] \to \mathbb{C}$ with $f(-\pi) = f(\pi)$ is the uniform
limit of the arithmetic means of the partial sums of its Fourier series.

In what follows, we will describe several instances when the Fourier series of a
function $f$ converges pointwise (or even uniformly) to the function $f$.


12.2.1 Theorem Let $f : \mathbb{R} \to \mathbb{C}$ be a continuous periodic function, of period $2\pi$,
whose complex Fourier series is uniformly convergent. Then, the sum of this series
is precisely the function $f$.


The real variant of this result is also true, with a similar argument.


Proof Let $g(x)$ be the sum of the series $\sum_{n \in \mathbb{Z}} \hat{f}(n) \chi_n(x)$. The function $g$ is continuous
and periodic, of period $2\pi$. See Theorem 6.8.3. Its Fourier coefficients are identical
with those of the function $f$. Indeed,

\[ \hat{g}(n) = \langle g, \chi_n \rangle = \Big\langle \sum_{k \in \mathbb{Z}} \hat{f}(k) \chi_k, \chi_n \Big\rangle = \sum_{k \in \mathbb{Z}} \hat{f}(k) \langle \chi_k, \chi_n \rangle = \hat{f}(n). \]



Fig. 12.1 The extension of $x^2$ by periodicity 


Thus, $f - g$ is a continuous periodic function whose Fourier coefficients are all zero. 
By Parseval's Identity, it follows that $f - g = 0$ almost everywhere. Since $f - g$ is 
a continuous function, this forces $f - g = 0$ at all points of $\mathbb{R}$. 


Let $f$ be the continuous periodic function of period $2\pi$, defined by $f(x) = x^2$ 
for $x \in [-\pi, \pi]$. See Fig. 12.1. 
Since $f$ is an even function, its Fourier series is a series of cosines, precisely, 

$$\frac{\pi^2}{3} + 4 \sum_{n=1}^{\infty} \frac{(-1)^n \cos nx}{n^2}.$$

This series verifies the hypotheses of Weierstrass' M-Test of uniform convergence 
since $\left| \frac{(-1)^n \cos nx}{n^2} \right| \le \frac{1}{n^2}$ for all $n \ge 1$ and $x \in \mathbb{R}$, and the series $\sum_{n} \frac{1}{n^2}$ is convergent. 

Thus, Theorem 12.2.1 applies and gives us the following Fourier series expansion: 

$$x^2 = \frac{\pi^2}{3} + 4 \sum_{n=1}^{\infty} \frac{(-1)^n \cos nx}{n^2} \quad \text{for every } x \in [-\pi, \pi].$$

For $x = 0$, we infer the formula 

$$\sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n^2} = \frac{\pi^2}{12}.$$
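The expansion of $x^2$ can be checked numerically. The following is only a minimal sketch in Python; the function name `partial_sum` and the truncation at 2000 terms are illustrative choices, not part of the text.

```python
import math

def partial_sum(x, terms):
    """Partial sum of pi^2/3 + 4 * sum_{n>=1} (-1)^n cos(nx) / n^2."""
    s = math.pi ** 2 / 3
    for n in range(1, terms + 1):
        s += 4 * (-1) ** n * math.cos(n * x) / n ** 2
    return s

# Compare the truncated series with x^2 at a few points of [-pi, pi].
for x in (0.0, 1.0, math.pi):
    print(x, partial_sum(x, 2000), x ** 2)

# The case x = 0, rearranged: sum (-1)^(n+1) / n^2 should approach pi^2 / 12.
alternating = sum((-1) ** (n + 1) / n ** 2 for n in range(1, 2001))
print(alternating, math.pi ** 2 / 12)
```

At the endpoint $x = \pi$ the terms do not alternate, so the truncation error there decays only like the tail of $\sum 1/n^2$.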

In general, it is difficult to check directly the uniform convergence of a series, 
but in the presence of differentiability (or of some substitute for it) one can 
state much more practical criteria for Fourier series expansion. A $C^1$ companion of 
Theorem 12.2.1 is as follows. 


12.2.2 Proposition Suppose that $f : \mathbb{R} \to \mathbb{C}$ is a $C^1$ periodic function, of period 
$2\pi$. Then its Fourier series is uniformly and absolutely convergent to $f$. 
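Proof Clearly, $f'$ is a continuous periodic function, of period $2\pi$. Integrating by parts, we have 

$$\hat{f}(n) = \frac{1}{\sqrt{2\pi}} \int_{-\pi}^{\pi} f(t) e^{-int}\, dt = -\frac{1}{\sqrt{2\pi}} \left. f(t) \frac{e^{-int}}{in} \right|_{-\pi}^{\pi} + \frac{1}{in\sqrt{2\pi}} \int_{-\pi}^{\pi} f'(t) e^{-int}\, dt = -\frac{i}{n} \widehat{f'}(n)$$

for all $n \ne 0$. Then, 

$$|\hat{f}(n)| \le \frac{1}{2} \left( \frac{1}{n^2} + |\widehat{f'}(n)|^2 \right)$$

and by Bessel's Inequality, 

$$\sum_{n \in \mathbb{Z}} |\widehat{f'}(n)|^2 \le \int_{-\pi}^{\pi} |f'(t)|^2\, dt.$$

Therefore $\sum_{n \in \mathbb{Z}} |\hat{f}(n)| < \infty$ and Weierstrass' M-Test applies. By Theorem 12.2.1, the sum of the series is $f$. 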


Surprisingly, the convergence at a given point x of a Fourier series depends only 
on the behavior of the function that generated the series on an arbitrarily small 
neighborhood of x. 

To prove this, we need to remark that the partial sums $S_n(f; x)$ of the Fourier 
series associated to a $2\pi$-periodic function $f$ can be computed via an analogue of 
the convolution product. Indeed, 

$$S_n(f; x) = \frac{1}{\sqrt{2\pi}} \sum_{k=-n}^{n} \hat{f}(k) e^{ikx} = \frac{1}{2\pi} \sum_{k=-n}^{n} \left( \int_{-\pi}^{\pi} f(t) e^{-ikt}\, dt \right) e^{ikx} = \int_{-\pi}^{\pi} f(t) D_n(x - t)\, dt = \int_{-\pi}^{\pi} f(x - t) D_n(t)\, dt,$$

where the functions $D_n(t)$ (called Dirichlet kernels) are defined by the formulas 

$$D_n(t) = \frac{1}{2\pi} \sum_{k=-n}^{n} e^{ikt}, \quad n \in \mathbb{N}.$$

A straightforward computation yields 

$$D_n(t) = \frac{\sin \left( n + \frac{1}{2} \right) t}{2\pi \sin \frac{t}{2}}.$$
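The closed form of the Dirichlet kernel is easy to confirm numerically. The sketch below, with the illustrative names `dirichlet_sum` and `dirichlet_closed`, evaluates both expressions away from the points where $\sin \frac{t}{2} = 0$.

```python
import cmath
import math

def dirichlet_sum(n, t):
    """D_n(t) from the definition (1/(2 pi)) * sum_{k=-n}^{n} e^{ikt}."""
    total = sum(cmath.exp(1j * k * t) for k in range(-n, n + 1))
    return total.real / (2 * math.pi)

def dirichlet_closed(n, t):
    """Closed form sin((n + 1/2) t) / (2 pi sin(t/2)), valid when sin(t/2) != 0."""
    return math.sin((n + 0.5) * t) / (2 * math.pi * math.sin(t / 2))

for n in (1, 5, 20):
    for t in (0.3, 1.7, -2.5):
        print(n, t, dirichlet_sum(n, t), dirichlet_closed(n, t))
```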

12.2.3 Theorem (Riemann's Localization Principle) Suppose that $f : \mathbb{R} \to \mathbb{C}$ is a 
periodic function, of period $2\pi$, which is Lebesgue integrable on every interval of 
length $2\pi$. Then the convergence or divergence of the Fourier series at a point $x$, 
and the value of the sum when it converges, depend only on the behavior of $f$ in an 
arbitrarily small neighborhood of $x$. 


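Proof Indeed, for every $\delta \in (0, \pi)$, we have 

$$S_n(f; x) = \int_{-\pi}^{\pi} f(x - t) D_n(t)\, dt = \int_{-\delta}^{\delta} f(x - t) D_n(t)\, dt + \frac{1}{2\pi} \int_{\delta \le |t| \le \pi} \frac{f(x - t)}{\sin \frac{t}{2}} \sin \left( n + \frac{1}{2} \right) t\, dt.$$

Since $\left| \sin \frac{t}{2} \right|^{-1} \le \left( \sin \frac{\delta}{2} \right)^{-1}$ for $\delta \le |t| \le \pi$, from the Riemann-Lebesgue Lemma 
we infer that the last integral tends to 0 as $n \to \infty$. Therefore, the behavior of 
$S_n(f; x)$ when $n \to \infty$ is perfectly determined by the values of $f$ in a neighborhood 
$(x - \delta, x + \delta)$ of $x$ (no matter how small $\delta$ is). 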


The next result concerns the convergence of the Fourier series at a given point. 


12.2.4 Theorem Suppose that $f : \mathbb{R} \to \mathbb{C}$ is a periodic function, of period $2\pi$, 
which is Lebesgue integrable on every interval of length $2\pi$. Then, if $f$ is differentiable 
at a point $x_0$, the partial sums $S_n(f; x_0)$ converge to $f(x_0)$ as $n \to \infty$. 


Proof Without loss of generality we may assume that $x_0 = 0$ and $f(x_0) = 0$; for 
this, replace $f$ by the function $f(x + x_0) - f(x_0)$ if necessary. The auxiliary function 

$$g(x) = \begin{cases} \dfrac{f(x)}{e^{ix} - 1} & \text{if } x \in [-\pi, \pi] \setminus \{0\} \\[2mm] -i f'(0) & \text{if } x = 0 \end{cases}$$

is integrable and $g(-\pi) = g(\pi)$, so it can be extended by periodicity to $\mathbb{R}$. Since 
$f(x) = g(x)(e^{ix} - 1)$, an easy computation yields the formula 

$$\hat{f}(k) = \hat{g}(k - 1) - \hat{g}(k).$$

Then the sum telescopes, and 

$$S_n(f; 0) = \frac{1}{\sqrt{2\pi}} \sum_{k=-n}^{n} \hat{f}(k) = \frac{1}{\sqrt{2\pi}} \left( \hat{g}(-n - 1) - \hat{g}(n) \right)$$

converges to $f(0) = 0$, according to the Riemann-Lebesgue Lemma 11.9.1. 


We will state now a criterion of convergence that allows jump discontinuities 
(and covers almost all important applications of Fourier series). 


12.2.5 Dirichlet's Pointwise Convergence Theorem Let $f : \mathbb{R} \to \mathbb{C}$ be a periodic 
function, of period $2\pi$, which is Lebesgue integrable on every interval of length 
$2\pi$. If $f$ admits at the point $x_0$ a discontinuity of the first kind and the limits 

$$f'(x_0-) = \lim_{h \to 0+} \frac{f(x_0 - h) - f(x_0-)}{-h}$$

and 

$$f'(x_0+) = \lim_{h \to 0+} \frac{f(x_0 + h) - f(x_0+)}{h}$$

exist and are finite, then the partial sums $S_n(f; x_0)$ converge to $\dfrac{f(x_0-) + f(x_0+)}{2}$ as $n \to \infty$. 


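Proof Indeed, since the Dirichlet kernels are even functions, we have 

$$S_n(f; x_0) = \int_{-\pi}^{\pi} f(x_0 - t) D_n(t)\, dt = \int_{-\pi}^{\pi} \frac{f(x_0 - t) + f(x_0 + t)}{2} D_n(t)\, dt = S_n(g; 0),$$

where $g$ is the function defined by the formula 

$$g(t) = \begin{cases} \dfrac{f(x_0 - t) + f(x_0 + t)}{2} & \text{if } t \in [-\pi, \pi] \setminus \{0\} \\[2mm] \dfrac{f(x_0-) + f(x_0+)}{2} & \text{if } t = 0. \end{cases}$$

The function $g$ is differentiable at $t = 0$ and thus, from Theorem 12.2.4 we infer that 

$$S_n(g; 0) \to \frac{f(x_0-) + f(x_0+)}{2} \quad \text{as } n \to \infty.$$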


For applications, see Exercise 1(b) and (c), Exercise 3 and Exercise 6. 
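As a numerical illustration of Theorems 12.2.4 and 12.2.5, consider the $2\pi$-periodic extension of $e^x$ from $(-\pi, \pi)$, a hypothetical example not taken from the text. It jumps at $x = \pi$, where the one-sided limits are $e^{\pi}$ and $e^{-\pi}$, so the partial sums should approach $\cosh \pi$ there, while at $x = 0$ they should approach $e^0 = 1$. The sketch below uses the coefficients $c_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^x e^{-inx}\, dx$, a normalization chosen for convenience; the symmetric partial sums are the same as $S_N(f; x)$.

```python
import cmath
import math

def coeff(n):
    """c_n = (1/(2 pi)) * integral of e^x e^{-inx} over [-pi, pi], in closed form."""
    return (-1) ** n * math.sinh(math.pi) / (math.pi * (1 - 1j * n))

def partial_sum(x, N):
    """Symmetric partial sum sum_{|n| <= N} c_n e^{inx}; real for real x."""
    return sum(coeff(n) * cmath.exp(1j * n * x) for n in range(-N, N + 1)).real

print(partial_sum(math.pi, 2000), math.cosh(math.pi))  # midpoint of the jump
print(partial_sum(0.0, 2000), 1.0)                     # point of differentiability
```

The convergence at the jump is slow (the error behaves like $1/N$), in contrast with the point $x = 0$.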
Exercises 


1. Find the Fourier series expansions of the functions: 
(a) $f(x) = \sin^2 x$; (b) $f(x) = |\cos x|$; (c) $f(x) = \pi - |x|$ for $x \in [-\pi, \pi]$. 


2. Prove the formula 

$$\frac{(\pi - x)^2}{4} = \frac{\pi^2}{12} + \sum_{n=1}^{\infty} \frac{\cos nx}{n^2} \quad \text{for every } x \in [0, 2\pi].$$

3. Theorem 8.6.2 on term-by-term differentiation of a series of functions does 
not apply to the series in the preceding exercise. However, prove that 

$$\sum_{n=1}^{\infty} \frac{\sin nx}{n} = \frac{\pi - x}{2} \quad \text{for every } x \in (0, 2\pi).$$

What happens at the points 0 and $2\pi$? 

4. Prove that the series appearing in Exercise 2 can be integrated term-by-term and 
write down the new series obtained. 

5. By Proposition 12.2.2, the Fourier coefficients of every $2\pi$-periodic function 
$f \in C^1(\mathbb{R})$ verify the condition $\sum_{n \in \mathbb{Z}} |\hat{f}(n)| < \infty$. 
(a) Prove that if $f \in C^k(\mathbb{R})$ (with $k \ge 1$) is a $2\pi$-periodic function, then 
$\sum_{n \in \mathbb{Z}} |n|^{k-1} |\hat{f}(n)| < \infty$. 
(b) Infer that in the case where $f \in C^k(\mathbb{R}, \mathbb{R})$, this conclusion becomes 
$\sum_{n \ge 1} n^{k-1} (|a_n| + |b_n|) < \infty$. 

6. Suppose that the Fourier series of a $2\pi$-periodic function $f \in C(\mathbb{R}, \mathbb{R})$ verifies 
the condition $\sum_{n \ge 1} n (|a_n| + |b_n|) < \infty$. Prove that the function $f$ is of class 
$C^1$ and its Fourier series can be differentiated term-by-term. Illustrate this result 
by an example. 

7. The Fourier coefficients $c_n$ of a periodic function $f \in C^\infty(\mathbb{R})$ converge rapidly 
to zero. Prove that for every natural number $N$, there exists a constant 
$C_N$ such that 

$$|c_n| \le C_N \left( 1 + |n|^2 \right)^{-N/2} \quad \text{for every } n \in \mathbb{Z}.$$

Therefore, the Fourier series of $f$ and all differentiated series converge uniformly 
on $\mathbb{R}$. 

8. (The Poisson Summation Formula). Let $\varphi$ be a function in the Schwartz space 
$\mathcal{S}(\mathbb{R})$. Show that 

$$\sum_{n \in \mathbb{Z}} \varphi(2\pi n) = \sum_{n \in \mathbb{Z}} \hat{\varphi}(n).$$

Then, consider the particular case where $\varphi(x) = e^{-\alpha x^2}$ and $\alpha > 0$ is a parameter, 
to prove that the function $\theta(x) = \sum_{n \in \mathbb{Z}} e^{-\pi n^2 x}$ verifies the functional equation 
$\theta(1/x) = \sqrt{x}\, \theta(x)$ for all $x > 0$. 

[Hint: Use the periodic function of period $2\pi$, $\Phi(x) = \sum_{n \in \mathbb{Z}} \varphi(x + 2n\pi)$.] 
Note The function $\theta(x)$ can be used to prove that the analytic continuation of the 
zeta function verifies the functional equation 

$$\pi^{-z/2}\, \Gamma\left( \frac{z}{2} \right) \zeta(z) = \pi^{-(1-z)/2}\, \Gamma\left( \frac{1 - z}{2} \right) \zeta(1 - z),$$

a fact which led Riemann to formulate in 1859 a famous conjecture 
(still unsolved) on the zeroes of the zeta function. 

9. Prove the Fourier series expansion 

10. (a) Prove the formula 

$$\cos ax = \frac{2a \sin \pi a}{\pi} \left( \frac{1}{2a^2} + \sum_{n=1}^{\infty} \frac{(-1)^n}{a^2 - n^2} \cos nx \right)$$

for every $x \in [-\pi, \pi]$ and $a \in (-1, 1)$. 
(b) Infer that $\cot \pi a - \frac{1}{\pi a} = \frac{2a}{\pi} \sum_{n=1}^{\infty} \frac{1}{a^2 - n^2}$ and integrate this equation side-by-side 
from 0 to $t$ in order to obtain Euler's infinite product for the sine function: 

$$\frac{\sin \pi t}{\pi t} = \prod_{n=1}^{\infty} \left( 1 - \frac{t^2}{n^2} \right) \quad \text{for } t \in (-1, 1).$$

11. Use the Maclaurin expansion of the function $e^z$ to prove the following Fourier 
series expansions: 

$$\sum_{n=0}^{\infty} \frac{\cos nt}{n!} = e^{\cos t} \cos(\sin t) \quad \text{and} \quad \sum_{n=1}^{\infty} \frac{\sin nt}{n!} = e^{\cos t} \sin(\sin t).$$

12. (The Fejér kernels). These kernels are defined by averaging the sequence of 
Dirichlet kernels, 

$$F_n(t) = \frac{1}{n} \sum_{k=0}^{n-1} D_k(t) \quad \text{for } n = 1, 2, 3, \ldots.$$

Prove that: 

(a) $D_n(t) = \dfrac{\sin \left( n + \frac{1}{2} \right) t}{2\pi \sin \frac{t}{2}}$ and $F_n(t) = \dfrac{1}{2\pi n} \left( \dfrac{\sin \frac{nt}{2}}{\sin \frac{t}{2}} \right)^2$; 

(b) the functions $F_n(t)$ are even, positive, and continuous; 

(c) $\int_{-\pi}^{\pi} F_n(t)\, dt = 1$; 

(d) for every $\delta > 0$, we have 

$$0 \le F_n(x) \le \frac{1}{2\pi n} \left( \frac{1}{\sin \frac{\delta}{2}} \right)^2 \quad \text{for } \delta \le |x| \le \pi.$$

13. (Fejér's Approximation Theorem). Let $(F_n(t))_n$ be the sequence of Fejér kernels. 
Prove that every continuous function $f : [-\pi, \pi] \to \mathbb{C}$ with $f(-\pi) = f(\pi)$ 
has the property 

$$\int_{-\pi}^{\pi} f(x - t) F_n(t)\, dt \to f(x) \quad \text{uniformly on } [-\pi, \pi].$$

[Hint: Adapt the argument of Theorem 11.7.6c.] 
Note According to the definition of Fejér kernels, 

$$\int_{-\pi}^{\pi} f(x - t) F_n(t)\, dt = \frac{S_0(f; x) + S_1(f; x) + \cdots + S_{n-1}(f; x)}{n},$$

so Fejér's Approximation Theorem asserts the uniform convergence of the arithmetic 
means of the sequence of partial sums of the Fourier series of $f$. This 
provides an alternative proof of the Weierstrass Approximation Theorem for 
periodic functions. 
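The two descriptions of the Fejér kernel in Exercise 12, and the normalization in part (c), can be explored numerically before attempting a proof. The following is a sketch only; the function names and the midpoint quadrature are illustrative choices.

```python
import math

def dirichlet(k, t):
    """Dirichlet kernel D_k(t) in closed form (requires sin(t/2) != 0)."""
    return math.sin((k + 0.5) * t) / (2 * math.pi * math.sin(t / 2))

def fejer_mean(n, t):
    """F_n(t) as the arithmetic mean of D_0, ..., D_{n-1}."""
    return sum(dirichlet(k, t) for k in range(n)) / n

def fejer_closed(n, t):
    """Closed form (1/(2 pi n)) * (sin(n t/2) / sin(t/2))^2."""
    return (math.sin(n * t / 2) / math.sin(t / 2)) ** 2 / (2 * math.pi * n)

def integral_over_period(n, steps=20000):
    """Midpoint rule for the integral of F_n over [-pi, pi]; midpoints avoid t = 0."""
    h = 2 * math.pi / steps
    return sum(fejer_closed(n, -math.pi + (j + 0.5) * h) for j in range(steps)) * h

print(fejer_mean(8, 0.7), fejer_closed(8, 0.7))
print(integral_over_period(8))
```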


12.3 The Motion of Vibrating String 


In the vertical plane $xOy$ consider a string whose endpoints are $(0, 0)$ and $(\ell, 0)$. 
The points of the string vibrate in the vertical plane as a result of string tension. The 
motion is described by a function $u = u(x, t)$ that measures the vertical displacement 
from equilibrium of the particle at horizontal position $x$ and at time $t$. 
For mechanical reasons (see Vladimirov [3], pp. 27-29 for full details), the equation 
of motion for small oscillations of a frictionless string is 

$$\frac{\partial^2 u}{\partial t^2} = a^2 \frac{\partial^2 u}{\partial x^2} \quad \text{for } x \in (0, \ell) \text{ and } t > 0, \tag{12.6}$$

where $a$ is the square root of the ratio between the magnitude of the tension force and 
the density of the string. This equation is known as the one-dimensional wave 
equation, or d'Alembert equation. To Eq. (12.6) one should add the boundary 
conditions, 

$$u(0, t) = u(\ell, t) = 0 \quad \text{for all } t \ge 0, \tag{12.7}$$

and the initial conditions, prescribing the initial shape and the initial speed of the 
string, 

$$u(x, 0) = u_0(x) \quad \text{and} \quad \frac{\partial u}{\partial t}(x, 0) = u_1(x) \quad \text{for all } x \in [0, \ell], \tag{12.8}$$

to arrive at what is usually called the mixed problem for the one-dimensional wave 
equation. 

We are looking for the existence of the classical solution of the one-dimensional 
wave Eq. (12.6) which verifies the boundary conditions (12.7) and the initial conditions 
(12.8). This means a function belonging to 

$$C^2((0, \ell) \times (0, \infty), \mathbb{R}) \cap C^1([0, \ell] \times [0, \infty), \mathbb{R}).$$

The uniqueness of this solution (when it exists) is the subject 
of Exercise 3. 

We will next discuss the existence, by assuming that $u_0 \in C^2([0, \ell])$, $u_1 \in 
C^1([0, \ell])$ and 

$$u_0(0) = u_0(\ell) = 0 \quad \text{and} \quad u_1(0) = u_1(\ell) = 0. \tag{12.9}$$


The method (known as the separation of variables) consists of three steps. The 
first two steps are merely formal. They are intended to single out a possible solution. 
The fact that the proposed function is indeed a solution constitutes the objective of 
the third step. 

Step 1: Search for nonzero solutions of (12.6) and (12.7) having the form $u = 
X(x)T(t)$. Substituting $u$ in (12.6), we obtain $X(x)T''(t) = a^2 X''(x)T(t)$, whence 

$$\frac{T''(t)}{a^2 T(t)} = \frac{X''(x)}{X(x)}.$$

Since the left-hand side does not depend on $x$ and the right-hand side does not depend 
on $t$, we infer that the two ratios equal a constant $-\lambda$. 
The problem is now to determine the values of $\lambda$ for which the problem 

$$X''(x) + \lambda X(x) = 0, \qquad X(0) = X(\ell) = 0$$

admits nonzero solutions. According to Exercise 7, Sect. 8.2, we have to consider 
three cases: $\lambda < 0$, $\lambda = 0$ and $\lambda > 0$. In the first case the equation has the general 
solution $X(x) = Ae^{-\sqrt{-\lambda}\, x} + Be^{\sqrt{-\lambda}\, x}$ and the boundary conditions impose $A = B = 
0$. The conclusion is the same when $\lambda = 0$. If $\lambda > 0$, then the general solution of 
the equation is 

$$X(x) = A \cos \sqrt{\lambda}\, x + B \sin \sqrt{\lambda}\, x.$$

The condition $X(0) = 0$ imposes $A = 0$. Thus, to keep $B \ne 0$ and to ensure that 
$X(\ell) = 0$, it is necessary that $\sin \sqrt{\lambda}\, \ell = 0$. This is possible for $\lambda$ belonging to the 
sequence 

$$\lambda_n = \left( \frac{n\pi}{\ell} \right)^2, \quad n = 1, 2, 3, \ldots.$$

For each $n$, we obtain a solution $X_n(x) = \sin \frac{n\pi}{\ell} x$. The equation 

$$T''(t) + \left( \frac{n\pi a}{\ell} \right)^2 T(t) = 0$$

has the solutions 

$$T_n(t) = A_n \cos \frac{n\pi a}{\ell} t + B_n \sin \frac{n\pi a}{\ell} t,$$

where $A_n$ and $B_n$ are arbitrary constants. 
The functions 

$$u_n(x, t) = \left( A_n \cos \frac{n\pi a}{\ell} t + B_n \sin \frac{n\pi a}{\ell} t \right) \sin \frac{n\pi}{\ell} x \tag{12.10}$$

correspond to the modes of vibration of the string and they provide solutions for 
(12.6) and (12.7). 

Step 2: Proposing a solution for the mixed problem. 

Since every linear combination of solutions of (12.6) and (12.7) is still a solution, 
we search for a solution $u$ of the whole problem (12.6), (12.7) and (12.8) of the 
form $u = \sum_n c_n u_n$ (where the $c_n$'s are real numbers); in physics this is called the 
Principle of superposition. Thus, we have in mind functions $u$ of the form 

$$u(x, t) = \sum_{n=1}^{\infty} \left( A_n \cos \frac{n\pi a}{\ell} t + B_n \sin \frac{n\pi a}{\ell} t \right) \sin \frac{n\pi}{\ell} x. \tag{12.11}$$

The values of the coefficients $A_n$ and $B_n$ are uniquely determined by the initial 
conditions. For $t = 0$, we obtain 

$$\sum_{n=1}^{\infty} A_n \sin \frac{n\pi}{\ell} x = u_0(x), \tag{12.12}$$

and allowing term-by-term differentiation with respect to $t$ in (12.11), we get 

$$\sum_{n=1}^{\infty} \frac{n\pi a}{\ell} B_n \sin \frac{n\pi}{\ell} x = u_1(x). \tag{12.13}$$

Due to the condition (12.9), the functions $u_0$ and $u_1$ can be extended to $[-\ell, \ell]$ as 
odd functions and then to $\mathbb{R}$, by periodicity. Taking into account Proposition 12.2.2 
and Remark 12.1.11, we conclude that both Fourier series expansions (12.12) and 
(12.13) are perfectly valid and their coefficients are given by the formulas 

$$A_n = \frac{2}{\ell} \int_0^{\ell} u_0(x) \sin \frac{n\pi}{\ell} x\, dx = \alpha_n,$$

and 

$$\frac{n\pi a}{\ell} B_n = \frac{2}{\ell} \int_0^{\ell} u_1(x) \sin \frac{n\pi}{\ell} x\, dx = \beta_n.$$

Thus, formula (12.11) becomes 

$$u(x, t) = \sum_{n=1}^{\infty} \left( \alpha_n \cos \frac{n\pi a}{\ell} t + \frac{\ell}{n\pi a} \beta_n \sin \frac{n\pi a}{\ell} t \right) \sin \frac{n\pi}{\ell} x. \tag{12.14}$$


Step 3: The formula (12.14) provides a solution for the mixed problem of the 
one-dimensional wave equation. 

Since $u_0$ is of class $C^2$ and $u_1$ is of class $C^1$, their coefficients verify the conditions 
$\sum_{n=1}^{\infty} n|\alpha_n| < \infty$ and $\sum_{n=1}^{\infty} |\beta_n| < \infty$, respectively. See Exercise 5, Sect. 12.2. 
Then, the absolute value of the general term of the series (12.14) does not exceed 
$|\alpha_n| + \frac{\ell}{n\pi a} |\beta_n|$ and Weierstrass' M-Test applies. This fact assures the continuity of $u$. 

We next show that $\lim_{t \to 0} \frac{\partial u}{\partial t}(x, t) = \frac{\partial u}{\partial t}(x, 0) = u_1(x)$. This follows from 
Theorem 8.6.2 (on differentiation of series of functions) if we prove that the series 

$$\sum_{n=1}^{\infty} \left( -\frac{n\pi a}{\ell} \alpha_n \sin \frac{n\pi a}{\ell} t + \beta_n \cos \frac{n\pi a}{\ell} t \right) \sin \frac{n\pi}{\ell} x,$$

obtained from (12.14) by term-by-term differentiation with respect to $t$, is uniformly 
convergent. This is indeed the case, since 

$$\left| \left( -\frac{n\pi a}{\ell} \alpha_n \sin \frac{n\pi a}{\ell} t + \beta_n \cos \frac{n\pi a}{\ell} t \right) \sin \frac{n\pi}{\ell} x \right| \le \frac{n\pi a}{\ell} |\alpha_n| + |\beta_n|,$$

and Weierstrass' M-Test applies. 

When $u_0$ is of class $C^3$ and $u_1$ is of class $C^2$, we may continue in the same manner 
to prove the possibility to differentiate twice, term-by-term, the series (12.14) with 
respect to $x$ and $t$, respectively. This assures the existence and continuity of the 
derivatives $\frac{\partial^2 u}{\partial x^2}$ and $\frac{\partial^2 u}{\partial t^2}$ (as well as the fact that $u$ verifies the one-dimensional wave 
equation). 

However, a different approach based on d'Alembert's Formula shows that the class 
$C^2$ for $u_0$ and the class $C^1$ for $u_1$ suffice as well. Indeed, according to (12.12), 


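$$\sum_{n=1}^{\infty} \alpha_n \cos \frac{n\pi a}{\ell} t\, \sin \frac{n\pi}{\ell} x = \frac{1}{2} \sum_{n=1}^{\infty} \alpha_n \sin \frac{n\pi}{\ell} (x - at) + \frac{1}{2} \sum_{n=1}^{\infty} \alpha_n \sin \frac{n\pi}{\ell} (x + at) = \frac{u_0(x - at) + u_0(x + at)}{2},$$

and a similar argument shows that 

$$\sum_{n=1}^{\infty} \beta_n \cos \frac{n\pi a}{\ell} t\, \sin \frac{n\pi}{\ell} x = \frac{u_1(x - at) + u_1(x + at)}{2}.$$

Integrating the last equation side-by-side (with respect to $t$), we obtain 

$$\sum_{n=1}^{\infty} \frac{\ell}{n\pi a} \beta_n \sin \frac{n\pi}{\ell} x\, \sin \frac{n\pi a}{\ell} t = \frac{1}{2} \int_0^t u_1(x - a\tau)\, d\tau + \frac{1}{2} \int_0^t u_1(x + a\tau)\, d\tau = \frac{1}{2a} \int_{x - at}^{x + at} u_1(\tau)\, d\tau.$$

In conclusion, the solution of the mixed problem can be put in compact form by 
d'Alembert's Formula, 

$$u(x, t) = \frac{u_0(x - at) + u_0(x + at)}{2} + \frac{1}{2a} \int_{x - at}^{x + at} u_1(\tau)\, d\tau.$$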
See also Exercise 5, Sect. 9.4. 
Other applications of the method of separation of variables for solving specific 
problems of mathematical physics can be found in the book of Vladimirov [3]. 
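
The agreement between the series (12.14) and d'Alembert's Formula can be observed numerically. The sketch below is an illustration under assumed data, not taken from the text: string length $\ell = 1$, wave speed $a = 2$, $u_0(x) = x(1 - x)$ and $u_1(x) = \sin \pi x$, for which $\alpha_n = 4(1 - (-1)^n)/(n\pi)^3$, $\beta_1 = 1$ and $\beta_n = 0$ for $n \ge 2$. In d'Alembert's Formula, $u_0$ and $u_1$ stand for their odd, 2-periodic extensions.

```python
import math

A_SPEED = 2.0  # hypothetical wave speed a; the string length is taken to be 1

def u0_odd(y):
    """Odd, 2-periodic extension of u0(x) = x(1 - x) from [0, 1]."""
    y = (y + 1.0) % 2.0 - 1.0  # reduce the argument to [-1, 1)
    return y * (1.0 - abs(y))

def alpha(n):
    """alpha_n = 2 * integral_0^1 x(1 - x) sin(n pi x) dx = 4(1 - (-1)^n) / (n pi)^3."""
    return 4.0 * (1 - (-1) ** n) / (n * math.pi) ** 3

def series_solution(x, t, terms=400):
    """Truncation of (12.14) for u0(x) = x(1 - x) and u1(x) = sin(pi x)."""
    a = A_SPEED
    s = sum(alpha(n) * math.cos(n * math.pi * a * t) * math.sin(n * math.pi * x)
            for n in range(1, terms + 1))
    # Only beta_1 = 1 is nonzero, giving a single sine term in time.
    s += math.sin(math.pi * a * t) * math.sin(math.pi * x) / (math.pi * a)
    return s

def dalembert(x, t):
    """d'Alembert's formula for the same data; the integral of sin(pi tau) is explicit."""
    a = A_SPEED
    lo, hi = x - a * t, x + a * t
    integral = (math.cos(math.pi * lo) - math.cos(math.pi * hi)) / math.pi
    return 0.5 * (u0_odd(lo) + u0_odd(hi)) + integral / (2 * a)

for (x, t) in ((0.3, 0.45), (0.7, 1.3)):
    print(x, t, series_solution(x, t), dalembert(x, t))
```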


Exercises 


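1. Solve the mixed problem 

$$\frac{\partial^2 u}{\partial t^2} = a^2 \frac{\partial^2 u}{\partial x^2} \quad \text{for } x \in (0, \ell) \text{ and } t > 0$$

$$u(0, t) = u(\ell, t) = 0 \quad \text{for } t \ge 0$$

$$u(x, 0) = u_0(x) \quad \text{and} \quad \frac{\partial u}{\partial t}(x, 0) = u_1(x) \quad \text{for } x \in [0, \ell]$$

in each of the following cases: 

(a) $u_0(x) = 0$ and $u_1(x) = 1$; 
(b) $u_0(x) = 1$ and $u_1(x) = 0$; 
(c) $u_0(x) = x$ and $u_1(x) = x^2$. 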

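2. Solve the mixed problem for the one-dimensional wave equation: 

$$\frac{\partial^2 u}{\partial t^2} = \frac{\partial^2 u}{\partial x^2} + \cos t \quad \text{for } x \in (0, \pi) \text{ and } t > 0,$$

$$u(0, t) = u(\pi, t) = 0 \quad \text{for } t \ge 0,$$

$$u(x, 0) = \frac{\partial u}{\partial t}(x, 0) = 0 \quad \text{for } x \in [0, \pi].$$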


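3. Consider a function 
$u = u(x, t) \in C^2((0, \ell) \times (0, \infty), \mathbb{R}) \cap C^1([0, \ell] \times [0, \infty), \mathbb{R})$ 
such that 

$$\frac{\partial^2 u}{\partial t^2} = a^2 \frac{\partial^2 u}{\partial x^2} \quad \text{for } x \in (0, \ell) \text{ and } t > 0$$

$$u(0, t) = u(\ell, t) = 0 \quad \text{for } t \ge 0$$

$$u(x, 0) = 0 \quad \text{and} \quad \frac{\partial u}{\partial t}(x, 0) = 0 \quad \text{for } x \in [0, \ell].$$

Prove that the integral 

$$E(t) = \frac{1}{2} \int_0^{\ell} \left[ \left( \frac{\partial u}{\partial t} \right)^2 + a^2 \left( \frac{\partial u}{\partial x} \right)^2 \right] dx,$$

called the energy of $u$, verifies $\frac{d}{dt} E(t) = 0$ and conclude that $u(x, t) = 0$ for all 
$x$ and $t$. 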


12.4 Notes and Remarks 


The problem of representing a function with period $2\pi$ as the sum of a trigonometric 
series of the form 

$$\frac{a_0}{2} + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx)$$

has its origin in the works of Daniel Bernoulli, Jean Le Rond d'Alembert, Joseph-Louis 
Lagrange and Leonhard Euler. 

Jean Baptiste Joseph Fourier, in his book Théorie analytique de la chaleur 
(1822), considered many particular instances of such representations. However, 
as Enrique A. González-Velasco [4] noticed, “the methods that Fourier used to deal 
with heat problems were those of a true pioneer because he had to work with concepts 
that were not yet properly formulated. He worked with discontinuous functions 
when others dealt with continuous ones, used integral as an area when integral as an 
antiderivative was popular, and talked about the convergence of a series of functions 
before there was a definition of convergence.” The rigorous study of these series 
began with Johann Peter Gustav Lejeune Dirichlet (1829), who was able to prove the 
Fourier series expansion of the piecewise smooth functions. His test of convergence 
was later generalized by Marie Ennemond Camille Jordan (1881): 


12.4.1 Theorem (Jordan's Test [14]) Suppose that $f : \mathbb{R} \to \mathbb{C}$ is a periodic function 
of period $2\pi$ that is integrable on $[-\pi, \pi]$ and has bounded variation in a neighborhood 
of $x_0$. Then, its partial Fourier sums $S_n(f; x) = \frac{1}{\sqrt{2\pi}} \sum_{k=-n}^{n} \hat{f}(k) e^{ikx}$ verify 
at $x_0$ the formula 

$$\lim_{n \to \infty} S_n(f; x_0) = \frac{f(x_0 - 0) + f(x_0 + 0)}{2}.$$

See the classical monograph of Antoni Zygmund [5], Theorem II.8.14. 

If the hypothesis on the bounded variation is dropped, then a result due to Henri 
Léon Lebesgue [15] asserts that at every point $x_0$ at which the limits $f(x_0 - 0)$ and 
$f(x_0 + 0)$ exist (and, if both are infinite, they are of the same sign), we have 

$$\frac{S_0(f; x_0) + S_1(f; x_0) + \cdots + S_{n-1}(f; x_0)}{n} \to \frac{f(x_0 - 0) + f(x_0 + 0)}{2}.$$

See [5], Theorem III.3.4. The proof of Theorem 12.2.4 given in the text is due to 
Paul R. Chernoff [13]. 

In neighborhoods of a point of discontinuity $x_0$ one encounters the Gibbs phenomenon, 
which is a feature of the nonuniformity of the convergence of the sequence 
$(S_n(f; x))_n$ near $x_0$. Consider, for example, the Fourier series expansion 

$$\operatorname{sgn} x = \frac{4}{\pi} \sum_{k=1}^{\infty} \frac{\sin(2k - 1)x}{2k - 1}, \quad x \in (-\pi, \pi).$$

Its partial sums, 

$$S_n(f; x) = \frac{4}{\pi} \sum_{k=1}^{n} \frac{\sin(2k - 1)x}{2k - 1} = \frac{4}{\pi} \int_0^x [\cos t + \cos 3t + \cdots + \cos(2n - 1)t]\, dt = \frac{2}{\pi} \int_0^x \frac{\sin 2nt}{\sin t}\, dt,$$

attain their maximum at $x = \frac{\pi}{2n}$ and 

$$\lim_{n \to \infty} S_n\left( f; \frac{\pi}{2n} \right) = \frac{2}{\pi} \int_0^{\pi} \frac{\sin t}{t}\, dt \approx 1.179,$$

which represents a deviation of circa 18 % from $\lim_{x \to 0+} \operatorname{sgn} x = 1$. See the monograph 
of Zygmund [5] for additional information on this phenomenon. 
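
The overshoot can be observed directly by evaluating the partial sums at $x = \frac{\pi}{2n}$. The following short sketch (the function name is an illustrative choice) prints values that settle near 1.179 instead of 1.

```python
import math

def square_wave_partial(x, n):
    """(4/pi) * sum_{k=1}^{n} sin((2k-1)x) / (2k-1), the n-term partial sum for sgn x."""
    return 4 / math.pi * sum(math.sin((2 * k - 1) * x) / (2 * k - 1)
                             for k in range(1, n + 1))

# The maximum near the jump sits at x = pi/(2n); its height does not decrease to 1.
for n in (10, 100, 1000):
    print(n, square_wave_partial(math.pi / (2 * n), n))
```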

In engineering, a signal is any Lebesgue integrable function $f : \mathbb{R} \to \mathbb{C}$. A signal 
is called band limited if there exists a number $\Omega > 0$ (called the band width) such 
that 

$$\hat{f}(\omega) = 0 \quad \text{for every } \omega \in \mathbb{R} \setminus [-\Omega, \Omega].$$

Such a signal is then given by 

$$f(t) = \int_{-\Omega}^{\Omega} \hat{f}(\omega) e^{i\omega t}\, d\omega.$$

If a signal $f$ is sampled at regular intervals of time and at a sufficiently high rate, 
then the samples contain all the information of the original signal: 


12.4.2 Sampling Theorem If $f$ is a band limited signal with band width $\Omega$ and 
$L < 1/\Omega$, then $f$ is reconstructed from its values sampled at the times $\{nL : n \in \mathbb{Z}\}$ 
as the sum of a convergent series, 

$$f(t) = \sum_{n \in \mathbb{Z}} f(nL)\, \frac{\sin \left( \frac{\pi (t - nL)}{L} \right)}{\frac{\pi (t - nL)}{L}}.$$

A proof of this result based on the Poisson Summation Formula can be found in the 
paper of Boas [6]. Steiner [7] went further, finding a link with Plancherel's Theorem. 

More about the history and importance of the Sampling Theorem can be found in 
the paper of Higgins [8]. 
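
A rough numerical experiment with the sampling series is sketched below. It rests on assumptions that are not part of the text: the test signal is $f(t) = (\sin t / t)^2$, whose spectrum (in the angular-frequency convention) vanishes outside $[-2, 2]$, so $\Omega = 2$; the sampling step is $L = 0.4 < 1/\Omega$; and the series is truncated at $|n| \le 2000$.

```python
import math

def signal(t):
    """(sin t / t)^2, an integrable signal with spectrum supported in [-2, 2]."""
    return 1.0 if t == 0 else (math.sin(t) / t) ** 2

def sinc(u):
    """sin(pi u) / (pi u), with the removable singularity at u = 0 filled in."""
    return 1.0 if u == 0 else math.sin(math.pi * u) / (math.pi * u)

def reconstruct(t, step, n_max=2000):
    """Truncated sampling series: sum over |n| <= n_max of f(n*step) * sinc((t - n*step)/step)."""
    return sum(signal(n * step) * sinc((t - n * step) / step)
               for n in range(-n_max, n_max + 1))

for t in (0.3, 1.234, 5.0):
    print(t, reconstruct(t, 0.4), signal(t))
```

The point $t$ is deliberately chosen off the sampling grid, so the agreement reflects the interpolation and not a coincidence with a sample.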

A nice introduction to the subject of Fourier series is offered by the book of 
Rajendra Bhatia [9]. As concerns the present state of art, the reader is kindly invited 
to consult the two volumes of Loukas Grafakos on Fourier Analysis [10, 11]. 

A search on the Internet reveals numerous deep applications of Fourier series in 
electrical engineering, vibration analysis, acoustics, optics, signal processing, image 
processing, quantum mechanics, econometrics, etc. 

During the last three decades, an important variation on Fourier series, replacing 
the sine and cosine functions with new families of functions called wavelets, was 
intensively developed. Details can be found in the book of Kaiser [12]. 
Appendix A 
An Introduction to One-Dimensional Dynamics 


Many practical problems, such as population evolution, weather forecasting, 
climate modeling and planetary stability, motivate the study of the 
theoretical and computational aspects of the asymptotic behavior of the iterates of a 
function. The aim of this appendix is to give an idea of the complexity of this subject. 


A.1 Discrete Dynamical Systems 


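A discrete dynamical system on a metric space $M$ is the sequence $(f^n)_n$ of iterates of 
a function $f : M \to M$, that is, 

$$f^0 = \mathrm{id}_M \quad \text{and} \quad f^n = \underbrace{f \circ f \circ \cdots \circ f}_{n \text{ times}} \quad \text{for } n \ge 1.$$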


We will denote by (M, f) the dynamical system generated by f. 

The main problem of interest in connection with a dynamical system is the asymptotic 
behavior of its trajectories. For each point $a$ of $M$, the trajectory of $a$ is the 
sequence 

$$x_0 = a \quad \text{and} \quad x_n = f(x_{n-1}) \quad \text{for } n \ge 1,$$

which is nothing but the sequence of iterates of $f$ computed at $a$. The values of this 
sequence constitute the orbit $O(a)$ of the point $a$. 

The problem of understanding the structure of orbits turns out to be very complex. 

If $a \in M$ and $f^n(a) = a$ for some integer $n \ge 1$, then $a$ is called a periodic point 
of $f$ (of period $n$). The smallest such $n$ is the prime period of $a$. The orbit of a periodic 
point is called a periodic orbit. It can consist of only one point if $a$ is a fixed point of 
$f$, that is, if $f(a) = a$. 

For the identity function of $\mathbb{R}$, all the points are fixed points. For the function 
$f(x) = -x$, there is only one fixed point, the origin, all the others being periodic 
of period 2. An interesting fact concerning the case where $M$ is an interval is that 
the existence of a periodic point of prime period 3 implies the existence of periodic 
points of any other prime period. See Exercise 6. 

The fixed points and the periodic orbits are examples of invariant sets. A subset 
$A$ of $M$ is called invariant under $f$ if $f(A) \subset A$. Intuitively, this means that once a 
trajectory of the system enters $A$, it will never leave it again. 

When M is an interval, the graphical analysis (or the cobweb plot) may have an 
important role in understanding the dynamics of a function. A simple illustration of 
how it works is shown in Fig. A.1. 

The intersection of the graph of $f$ with the line $y = x$ gives us the fixed points. The 
orbit of a point $a$ can be seen in the following way: draw a vertical line through $a$. It 
will cross the graph at $(a, f(a))$. The horizontal line through this point will intersect 
the line $y = x$ at $(f(a), f(a))$. The vertical line through this point will cross the graph 
at $(f(a), f^2(a))$ and so on. In the picture, the arrows show the direction of motion 
of the iterates. 


A.1.1 Example (a) Let $f : [-1, \infty) \to [-1, \infty)$, $f(x) = \sqrt{1 + x}$. This function has 
a unique fixed point, $p = \frac{1 + \sqrt{5}}{2}$. The analysis of the graph shows that if $x_0 < p$ then 
$(f^n(x_0))_n$ is a sequence increasing to $p$ while if $x_0 > p$, then $(f^n(x_0))_n$ is a sequence 
decreasing to $p$. See Fig. A.2. 
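
The monotone behavior described in part (a) can be reproduced by iterating the function directly. The sketch below uses a hypothetical helper `trajectory` and two arbitrarily chosen starting points, one on each side of $p$.

```python
import math

def trajectory(f, a, steps):
    """Return [a, f(a), f^2(a), ..., f^steps(a)]."""
    xs = [a]
    for _ in range(steps):
        xs.append(f(xs[-1]))
    return xs

f = lambda x: math.sqrt(1 + x)
p = (1 + math.sqrt(5)) / 2          # the fixed point of f

below = trajectory(f, -1.0, 40)     # starts below p, should increase to p
above = trajectory(f, 10.0, 40)     # starts above p, should decrease to p
print(below[:5], below[-1])
print(above[:5], above[-1])
print(p)
```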


Fig. A.1 The graphical analysis 

Fig. A.2 Dynamics of $f(x) = \sqrt{1 + x}$ 

Fig. A.3 Dynamics of $f_\lambda(x) = \lambda x$ 


(b) Let $f_\lambda : \mathbb{R} \to \mathbb{R}$, $f_\lambda(x) = \lambda x$. The origin is the only fixed point of $f_\lambda$, unless 
$\lambda = 1$. If $\lambda$ changes a little bit, then the dynamics of the points close to 0 can change 
considerably. See Fig. A.3. We call such a value of $\lambda$ a value of bifurcation for the 
given family of functions $f_\lambda$. 

If $\lambda = 0$, then $f_\lambda^n(x_0) = 0$ for every $x_0$ and every $n \ge 1$. 

If $\lambda \in (0, 1)$, then all trajectories converge to 0 (increasing if $x_0 < 0$ and 
decreasing if $x_0 > 0$). 

If $\lambda = 1$, all the trajectories are constant. 

If $\lambda > 1$, then for $x_0 > 0$, the trajectory increases to $\infty$ and for $x_0 < 0$, the 
trajectory decreases to $-\infty$. 

If $\lambda < -1$, then all trajectories move away from the origin. 

If $\lambda = -1$, then every point is periodic of period 2. 

If $\lambda \in (-1, 0)$, then all trajectories converge to 0. 


The above examples show that there are many types of fixed points. 

A fixed point $p$ is called an attractor (for the dynamical system generated by 
$f : M \to M$) if there is an open neighborhood $U$ of $p$ in $M$ such that $f^n(x) \to p$ for 
every $x \in U$. The set $U$ is called a basin of attraction of $p$. If we can take $U = M$, 
then $p$ is said to be a global attractor. 

A fixed point $p$ is said to be a repellor if there is a neighborhood $U$ of $p$ such that 
for every $x \in U \setminus \{p\}$, there is $n$ with $f^n(x) \notin U$. Notice that in a case like this, it may be 
possible that for some $m > n$, $f^m(x) \in U$. 

There are indifferent fixed points, which are neither attractive nor repelling. For 
example, the origin for the identity function on $\mathbb{R}$. 

The above terminology extends verbatim to periodic orbits. For example, a periodic 
orbit $O(a)$ is an attractor if there is an open set $U$ that contains $O(a)$ such that 
$d(f^n(x), O(a)) \to 0$ for every $x \in U$. 

An important source of global attractors is the Contraction Principle. Recall that 
a function $f$, from the metric space $M$ to itself, is a contraction if there is $C \in [0, 1)$ 
such that 

$$d(f(x), f(y)) \le C\, d(x, y)$$

for every $x, y \in M$. According to the Mean Value Theorem, a differentiable 
function $f$ mapping an interval $I$ into itself is a contraction if and only if $C = 
\sup \{|f'(x)| : x \in I\} < 1$. 


A.1.2 The Banach Contraction Principle Let $M$ be a complete metric space and 
$f : M \to M$ be a contraction. Then, $f$ has a unique fixed point $p$, which is a global 
attractor. 


Moreover, the point $p$ can be found by the method of successive approximations: 
starting with the initial approximation $x_0$, the sequence 

$$x_{n+1} = f(x_n) \quad \text{for } n \in \mathbb{N},$$

(usually known as the sequence of successive approximations) always converges 
to $p$. 


Proof Let $x_0 \in M$ be arbitrarily fixed and $x_n = f^n(x_0)$ for $n \ge 1$. Then 

$$d(x_{n+1}, x_n) = d(f(x_n), f(x_{n-1})) \le C\, d(x_n, x_{n-1})$$

and, using mathematical induction, we obtain that 

$$d(x_{n+1}, x_n) \le C^n d(x_1, x_0).$$

Taking into account the triangle inequality, we infer that 

$$\begin{aligned} d(x_{n+k}, x_n) &\le d(x_{n+k}, x_{n+k-1}) + d(x_{n+k-1}, x_{n+k-2}) + \cdots + d(x_{n+1}, x_n) \\ &\le C^{k-1} d(x_{n+1}, x_n) + C^{k-2} d(x_{n+1}, x_n) + \cdots + d(x_{n+1}, x_n) \\ &= (C^{k-1} + C^{k-2} + \cdots + 1)\, d(x_{n+1}, x_n) \\ &\le \frac{1}{1 - C}\, d(x_{n+1}, x_n) \le \frac{C^n}{1 - C}\, d(x_1, x_0), \end{aligned}$$

for all $k, n \in \mathbb{N}$. Since the last term converges to 0 as $n$ tends to $\infty$, we conclude that 
$(x_n)_n$ is a Cauchy sequence in $M$ and thus, a convergent one, because the space $M$ is 
complete. Let $p$ be the limit of the sequence. By passing to the limit in the formula 
$x_{n+1} = f(x_n)$ we obtain that $p = f(p)$, that is, $p$ is a fixed point. Any other fixed 
point $q$ must verify 

$$d(p, q) = d(f(p), f(q)) \le C\, d(p, q),$$

which is not possible since $d(p, q) > 0$ and $C - 1 < 0$. In conclusion, $p$ is the only 
fixed point of $f$ and every orbit converges to $p$. 


Under the hypotheses of the Banach Contraction Principle, the method of successive 
approximations provides a fast algorithm for solving numerically equations of 
the form $f(x) = x$. It yields approximate solutions which verify both the conditions 
(AS1) and (AS2) mentioned in Sect. 6.4. Indeed, looking back in the proof, we see that 

$$d(x_{n+k}, x_n) \le \frac{1}{1 - C}\, d(x_{n+1}, x_n) \le \frac{C^n}{1 - C}\, d(x_1, x_0),$$

for all $k, n \in \mathbb{N}$. Passing to the limit as $k \to \infty$, we infer that 

$$d(p, x_n) \le \frac{1}{1 - C}\, d(x_{n+1}, x_n) \le \frac{C^n}{1 - C}\, d(x_1, x_0).$$

This yields not only the convergence $x_n \to p$, but also the fact that the distance 
$d(x_{n+1}, x_n)$ between two successive approximations offers a practical control on the 
distance of the approximation $x_n$ from $p$. 

The method of successive approximations is more general than the Contraction 
Principle. In other words, the method may work even for functions that are not 
contractions. An example is offered in Exercise 3. 
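
The a posteriori estimate $d(p, x_n) \le \frac{1}{1 - C}\, d(x_{n+1}, x_n)$ gives a natural stopping rule. The sketch below applies it to an illustrative example not taken from the text: $f(x) = \cos x$ on $[0, 1]$, which maps this interval into itself and is a contraction with $C = \sin 1$.

```python
import math

def successive_approximations(f, x0, C, tol):
    """Iterate x_{n+1} = f(x_n) until d(x_{n+1}, x_n) / (1 - C) < tol.

    Returns the current approximation x_n together with the bound on |p - x_n|.
    """
    x = x0
    while True:
        x_next = f(x)
        bound = abs(x_next - x) / (1 - C)  # a posteriori bound for |p - x|
        if bound < tol:
            return x, bound
        x = x_next

x, bound = successive_approximations(math.cos, 1.0, math.sin(1.0), 1e-8)
print(x, bound)
```

The returned point is guaranteed, by the estimate above, to lie within `bound` of the fixed point, without knowing the fixed point in advance.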

An easy-to-check condition (due to Oskar Perron), which implies that a periodic 
orbit is an attractor or a repellor, is that of hyperbolicity. A periodic orbit $O(p)$ of a 
$C^1$ differentiable function $f$ mapping an interval $I$ into itself is called hyperbolic if 

$$|(f^m)'(p)| \ne 1,$$

where $m$ is the prime period of $p$. An easy computation shows that this condition does 
not really depend on the point generating the periodic orbit. In fact, by the Chain Rule, 

$$(f^m)'(f^k(p)) = f'(f^{m+k-1}(p)) \cdot f'(f^{m+k-2}(p)) \cdots f'(f^k(p)) = f'(f^{m-1}(p)) \cdot f'(f^{m-2}(p)) \cdots f'(p)$$

for all $k \in \{0, \ldots, m - 1\}$, which yields 

$$(f^m)'(x) = (f^m)'(p) \quad \text{for all } x \in O(p).$$

A.1.3 Theorem Every hyperbolic orbit $O(p)$ of a $C^1$ differentiable function $f : I \to 
I$ is either an attractor or a repellor. More precisely, assuming that the prime period 
of $p$ is $m$, the following two alternatives occur: 

(a) If $|(f^m)'(p)| < 1$, then there is an open neighborhood $U$ of $O(p)$ such that 
$f(U) \subset U$ and for each $x$ of $U$ we have 

$$\lim_{n \to \infty} d(f^n(x), O(p)) = 0.$$

(b) If $|(f^m)'(p)| > 1$, then there is an open neighborhood $V$ of $O(p)$ such that for 
each $x$ in $V \setminus O(p)$, one can find a positive integer $n$ for which $f^n(x) \notin V$. 


Proof For the sake of simplicity we shall consider here only the case of fixed points. 
(a) Since $f$ has continuous derivative and $|f'(p)| < C < 1$, there is $\varepsilon > 0$ such that 

$$|f'(x)| < C \quad \text{on } U = (p - \varepsilon, p + \varepsilon) \cap I.$$

According to the Mean Value Theorem, for every $x$ in $U$ we have 

$$|f(x) - p| = |f(x) - f(p)| \le C |x - p| < \varepsilon.$$

Therefore $f(x)$ belongs to $U$ and, by iterating the above argument, we get 

$$|f^n(x) - p| \le C^n |x - p| \quad \text{for all } n \in \mathbb{N},$$

hence $f^n(x) \to p$. 


The assertion (b) can be argued in a similar manner. 


We will illustrate Theorem A.1.3 by considering the case of the odd function
$f(x) = (3x - x^3)/2$, $x \in \mathbb{R}$. This function has three roots, $-\sqrt{3}$, $0$ and $\sqrt{3}$, and
three hyperbolic fixed points, $-1$, $0$ and $1$. The points $-1$ and $1$ are attractors, while
the origin is a repellor. The graphical analysis shows that all points in the intervals
$(-\sqrt{3}, 0)$ and $(0, \sqrt{3})$ iterate respectively to $-1$ and $1$. See Fig. A.4. A little extra
work shows that the domain of attraction of $1$ is the union of intervals

$$\bigcup_{n=0}^{\infty} f^{-n}((0, \sqrt{3})) = (0, \sqrt{3}) \cup [-2, -\sqrt{3}) \cup \cdots,$$

while the domain of attraction of $-1$ is symmetric to this one with respect to the origin.
Here $f^{-n}(X) = f^{-1}(f^{-1}(\cdots f^{-1}(X) \cdots))$, $n$ times. The length of the components
gets smaller with $n$. They are all contained in the interval $(-\sqrt{5}, \sqrt{5})$, and this interval
decomposes into the domain of attraction of $1$, the domain of attraction of $-1$ and
the repellor $0$. Notice that $\{-\sqrt{5}, \sqrt{5}\}$ is a periodic repellor.

Fig. A.4 Nearby points iterate to different attractors
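The claims made for the cubic example can be checked numerically. The following Python sketch is only an illustration: the helper names `f`, `df` and `iterate` and the starting points $\pm 0.3$ are assumptions chosen for the demonstration, not part of the text. It evaluates the multipliers $f'(p)$ at the three fixed points, iterates two nearby points on either side of the origin, and computes the multiplier of the period-2 orbit $\{-\sqrt{5}, \sqrt{5}\}$ by the Chain Rule.

```python
import math

def f(x: float) -> float:
    """The odd cubic map of the example: f(x) = (3x - x^3)/2."""
    return (3.0 * x - x**3) / 2.0

def df(x: float) -> float:
    """Derivative f'(x) = (3 - 3x^2)/2, used to test hyperbolicity."""
    return (3.0 - 3.0 * x**2) / 2.0

def iterate(x: float, n: int) -> float:
    """Return the n-th iterate f^n(x)."""
    for _ in range(n):
        x = f(x)
    return x

# Multipliers at the fixed points -1, 0, 1: |f'| < 1 means attractor, |f'| > 1 repellor.
multipliers = {p: df(p) for p in (-1.0, 0.0, 1.0)}

# Two starting points close to the repellor 0 (illustrative choice) end up
# at different attractors.
limit_right = iterate(0.3, 60)
limit_left = iterate(-0.3, 60)

# Multiplier of the 2-cycle {-sqrt(5), sqrt(5)}: (f^2)'(p) = f'(f(p)) * f'(p).
r5 = math.sqrt(5.0)
two_cycle_multiplier = df(r5) * df(-r5)

print(multipliers, limit_left, limit_right, two_cycle_multiplier)
```

The multipliers $0$ at $\pm 1$ explain the very fast convergence, while the product $(-6)(-6) = 36$ confirms that the period-2 orbit is a repellor.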

The next example discusses the case of a discrete dynamical system generated by 
an isometry whose orbits are either periodic or dense. 


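A.1.4 Example (The dynamics of rotations) For $\theta \in \mathbb{R}$, we define the rotation of
angle $\theta$ on the unit circle as the function

$$R_\theta : S^1 \to S^1, \quad R_\theta(z) = e^{i\theta} z.$$

If $\theta/(2\pi) \in \mathbb{Q}$ then every orbit of $R_\theta$ is periodic. Indeed, if $\theta/(2\pi) = p/q$ with
$p, q \in \mathbb{Z}$, then

$$R_\theta^q(z) = e^{iq\theta} z = e^{2\pi i p} z = z \quad \text{for all } z \in S^1.$$

A result known as Jacobi’s Theorem asserts that in the case where $\theta/(2\pi)$ is not in $\mathbb{Q}$,
every orbit of $R_\theta$ is dense in $S^1$. Indeed, since $\theta/(2\pi)$ is irrational, the points $z$,
$R_\theta(z)$, $R_\theta^2(z)$, ... are distinct. Since $S^1$ is compact, there must exist a strictly
increasing sequence of indices $k(n)$ such that $(R_\theta^{k(n)}(z))_n$ is convergent. Then, for every
$\varepsilon > 0$, there is a natural number $N_\varepsilon$ such that

$$|R_\theta^{k(m)}(z) - R_\theta^{k(n)}(z)| < \varepsilon,$$

whenever $m, n \ge N_\varepsilon$. Put $N = k(N_\varepsilon + 1) - k(N_\varepsilon)$. Since rotations preserve distance,
we have

$$|R_\theta^{N}(z) - z| = |R_\theta^{k(N_\varepsilon)}(R_\theta^{N}(z)) - R_\theta^{k(N_\varepsilon)}(z)| = |R_\theta^{k(N_\varepsilon + 1)}(z) - R_\theta^{k(N_\varepsilon)}(z)| < \varepsilon,$$

and thus the sequence $z$, $R_\theta^{N}(z)$, $R_\theta^{2N}(z)$, ... divides $S^1$ in arcs of chordal length
less than $\varepsilon$. Clearly, any point $w \in S^1$ belongs to such an arc, which yields a natural
number $n$ such that $|w - R_\theta^n(z)| < \varepsilon$.

For $\theta = 1$, Jacobi’s Theorem easily yields that both sequences $(\sin n)_n$ and
$(\cos n)_n$ are dense in the interval $[-1, 1]$.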
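The density of $(\sin n)_n$ can be probed with a short search. The Python sketch below is a hedged illustration: the function name `first_close_index`, the target $0.5$, the tolerance and the search bound are all assumptions made for the demonstration. It looks for the first integer $n$ whose sine lies within a prescribed tolerance of a target value in $[-1, 1]$.

```python
import math

def first_close_index(target: float, tol: float, max_n: int):
    """Return the first n in 0..max_n with |sin(n) - target| < tol, or None."""
    for n in range(max_n + 1):
        if abs(math.sin(n) - target) < tol:
            return n
    return None

# Illustrative parameters: approximate 0.5 to within 0.02 using n <= 2000.
n_hit = first_close_index(0.5, 0.02, 2000)
print(n_hit, math.sin(n_hit))
```

Density guarantees that such an index exists for every target and every tolerance, but it gives no bound on its size; tightening the tolerance generally requires a longer search.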


Exercises 


1. Use the graphical analysis to determine the dynamic of the function f from R into 
itself given by 


f@ = x —x. 
2. A diffeomorphism $f : [a, b] \to [a, b]$ is called a Morse-Smale diffeomorphism if
all its periodic orbits are hyperbolic. Prove that:

(a) The map $f(x) = (x^3 + 3x)/4$ is a Morse-Smale diffeomorphism on
$[-1/2, 1/2]$.

(b) A Morse-Smale diffeomorphism can have only a finite number of fixed points.


3. Prove that the sequence

$$x_0 = a, \quad x_{n+1} = \sin x_n, \quad \text{for } n \ge 0,$$

converges to 0, the unique fixed point of the function $\sin x$, whenever $a \in \mathbb{R}$.


4. Let $f : [a, b] \to [a, b]$ be a Lipschitz function (of constant $C$) and $F : [a, b] \to
[a, b]$, $F(x) = (1 - \lambda)x + \lambda f(x)$, where $\lambda = 1/(1 + C)$. Prove that $F$ is increasing
and that for any $x_0 \in [a, b]$, the sequence $x_n = F^n(x_0)$ converges to a fixed point
of $f$.


5. Let $f : [a, b] \to [a, b]$ be a continuous function. Prove that for any $x_0 \in [a, b]$
the sequence $(x_n)_n$ defined by the formula

$$x_{n+1} = \frac{n x_n + f(x_n)}{n + 1}, \quad n \ge 0,$$

converges to a fixed point of $f$.


6. (Sharkovsky’s Theorem). If a continuous function $f : [a, b] \to [a, b]$ has a
periodic point of prime period 3, then $f$ has periodic points of any other prime
period. Give an argument following the next 4 steps:

Step 1: If $f([a, b])$ contains an interval $[c, d]$, then there is an interval $[a', b'] \subset
[a, b]$ such that $f([a', b']) = [c, d]$ and $f(\{a', b'\}) = \{c, d\}$.

Step 2: If $A$ is a nonempty compact subinterval of $[a, b]$ and $A \subset f(A)$, then $f$
has a fixed point.

Step 3: Suppose that $c \in [a, b]$ is a point of prime period 3 and $f(a) = c$,
$f(c) = b$, $f(b) = a$. Let $n > 2$ and $I_0 = I_1 = \cdots = I_{n-2} = I_n = [c, b]$
and $I_{n-1} = [a, c]$. There is a decreasing family $(A_k)_{k=0}^{n}$ of compact
subintervals of $[a, b]$ such that $f^k(A_k) = I_k$ for all $k = 0, 1, 2, \ldots, n$. By
Step 2, $f^n$ has a fixed point in $A_n$. Thus, $f$ has a periodic point of (prime)
period $n$.

Step 4: Conclude that the existence of a point of prime period 3 implies the
existence of periodic points of any other prime period.


7. Consider the piecewise linear map $f : [1, 5] \to [1, 5]$ for which $f(1) = 3$,
$f(2) = 5$, $f(3) = 4$, $f(4) = 2$ and $f(5) = 1$. Prove that $f$ has points of period 5
but no points of period 3. This example can be easily adapted to an example of
a piecewise linear map $f : [1, 2n + 1] \to [1, 2n + 1]$ that has points of period
$2n + 1$ but no points of period $2n - 1$.


8. Give an example of a piecewise linear map $f : [1, 4] \to [1, 4]$ that has points of
period $2^2$ but no points of period $2^3$.


A.2 Sensitive Dependence on Initial Conditions


When we follow the trajectory of a point $x_0$, we must keep in mind that actually
we are working only with an approximation of $x_0$ (and the best we can expect is the
knowledge of a neighborhood of the trajectory of $x_0$). Assuming that the generator
$f : M \to M$ is continuous, for every $\varepsilon > 0$, there is $\delta > 0$ such that $d(f(x),
f(x_0)) < \varepsilon$, whenever $d(x, x_0) < \delta$. But nothing assures us that

$$d(f^n(x), f^n(x_0)) < \varepsilon$$

for all $n > 0$. On the contrary, if we look at the doubling function on the unit circle,

$$f : S^1 \to S^1, \quad f(z) = z^2,$$

we see that for some exponents $n$, exactly the opposite is true: trajectories starting
arbitrarily close rapidly separate and thereafter have different futures. This is the
phenomenon of sensitive dependence on initial conditions.


A.2.1 Definition A discrete dynamical system $(M, f)$ exhibits sensitive dependence
on initial conditions if there is a number $\delta > 0$ such that for every $a \in M$ and every
neighborhood $V$ of $a$, there are $x \in V$ and $n \ge 1$ such that

$$d(f^n(a), f^n(x)) > \delta.$$
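For the doubling function the separation of nearby trajectories can be followed exactly. In the Python sketch below (an illustration under stated assumptions, not the authors' construction) a point $e^{2\pi i t}$ of the circle is represented by $t \in [0, 1)$, so that $f(z) = z^2$ becomes $t \mapsto 2t \bmod 1$; rational arithmetic avoids rounding. The names `doubling`, `chord` and `separations`, the base point $1/3$ and the initial gap $2^{-20}$ are hypothetical choices.

```python
from fractions import Fraction
import math

def doubling(t: Fraction) -> Fraction:
    """Angle form of f(z) = z^2 on the unit circle: t -> 2t mod 1."""
    return (2 * t) % 1

def chord(s: Fraction, t: Fraction) -> float:
    """Chordal distance |e^{2 pi i s} - e^{2 pi i t}| = 2 |sin(pi (s - t))|."""
    return 2.0 * abs(math.sin(math.pi * float(s - t)))

def separations(s: Fraction, t: Fraction, steps: int) -> list:
    """Distances d(f^n(s), f^n(t)) for n = 0, 1, ..., steps."""
    out = []
    for _ in range(steps + 1):
        out.append(chord(s, t))
        s, t = doubling(s), doubling(t)
    return out

# Two points whose angles differ by 2^-20 of a full turn (illustrative choice).
a = Fraction(1, 3)
x = a + Fraction(1, 2**20)
seps = separations(a, x, 19)
print(seps[0], seps[19])
```

The angular gap doubles at every step, so after 19 steps the two points are diametrically opposite; any $\delta < 2$ therefore works in Definition A.2.1 for this pair.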


In contemporary science, the sensitive dependence on initial conditions is regarded 
as the main feature for the chaotic behavior of a dynamical system (and also for the 
concept of chaos). 


A.2.2 Definition A discrete dynamical system (M, f) is called chaotic (in the sense 
of Devaney) if it satisfies the following three conditions: 


(T) topological transitivity, that is, for any pair of nonempty open subsets $U, V \subset M$,
there exists an integer $n \ge 1$ such that $f^n(U) \cap V \neq \emptyset$;

(P) the density of the set of periodic points in M; 

(S) sensitive dependence on initial conditions. 


This definition had a great impact in the popularization of the theory of chaotic dyn- 
amical systems. Nevertheless, the three requirements mentioned in Definition A.2.2 
are not independent. Indeed, as was shown by Banks et al. [1], if the metric space M 
is not a finite set and (M, f) verifies the conditions (T) and (P), then f also verifies the 
condition (S). On intervals, things are even more striking. A result due to Block and 
Coppel [2] asserts that if I is a nondegenerate interval, then any continuous function 
f : I — I which verifies the condition (T) of topological transitivity is chaotic. 


A.2.3 Lemma The doubling function on the unit circle is chaotic.

Proof Using the trigonometric form of complex numbers, we see that $f$ can be
expressed by the formula $f(\cos\theta, \sin\theta) = (\cos 2\theta, \sin 2\theta)$. Thus, this function
doubles the angular distance between points upon iteration. As a consequence, $f$ exhibits
sensitive dependence on initial conditions. Topological transitivity also follows from
this remark since any small arc in $S^1$ is eventually expanded by some iterate to cover
all of $S^1$. The periodic points of period $n$ of $f$ (the solutions of $z^{2^n} = z$) are the points

$$\left(\cos \frac{2k\pi}{2^n - 1}, \ \sin \frac{2k\pi}{2^n - 1}\right) \quad \text{for } k = 0, 1, \ldots, 2^n - 2.$$

As they are the vertices of a regular $(2^n - 1)$-sided polygon inscribed
in the unit circle, with one vertex at 1, we conclude that the set of all periodic points
of $f$ is dense in $S^1$.


A simple technique of proving the chaotic behavior of a system is based on 
topological conjugacy. 


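A.2.4 Definition A dynamical system $(M, f)$ is topologically semiconjugate to the
dynamical system $(N, g)$ if there exists a continuous surjective function $h : M \to N$
which makes the diagram

$$\begin{array}{ccc}
M & \xrightarrow{\ f\ } & M \\
{\scriptstyle h}\downarrow & & \downarrow{\scriptstyle h} \\
N & \xrightarrow{\ g\ } & N
\end{array}$$

commutative, that is, $g \circ h = h \circ f$. When $h$ is a homeomorphism, we say that $(M, f)$
is topologically conjugate to $(N, g)$.
Notice that the condition $g \circ h = h \circ f$ implies $g^n \circ h = h \circ f^n$ for all $n \in \mathbb{N}$.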


A.2.5 Proposition Suppose that M and N are two metric spaces containing infinitely 
many points and f : M — M and g: N — N are two continuous functions. If 
the dynamical system (M, f) is chaotic and semiconjugate to the dynamical system 
(N, g), then (N, g) is chaotic too. 


Proof Let $h : M \to N$ be as in Definition A.2.4. Clearly, $h$ carries the properties (T)
and (P) from $(M, f)$ to $(N, g)$. As concerns the property (S), this follows from the
aforementioned result of Banks et al. [1]. When $h$ verifies the condition

$$d(h(x), h(y)) \ge d(x, y) \quad \text{for all } x, y \in M,$$

the fact that $(N, g)$ has sensitive dependence on initial conditions follows directly
from Definition A.2.1.


An immediate consequence of Lemma A.2.3 and Proposition A.2.5 is the chaotic
behavior of the dynamical system generated by the Chebyshev polynomial of order 2,

$$T_2(x) = 2x^2 - 1 \quad \text{for } x \in [-1, 1].$$

If we substitute $x = \cos\theta$, we get $T_2(\cos\theta) = \cos(2\theta)$, so that the dynamical
system generated by the doubling function on the unit circle, $f(z) = z^2$, is semiconjugate
to that generated by $T_2$ via the mapping $\mathrm{pr}_1 : S^1 \to [-1, 1]$, $\mathrm{pr}_1(z) = \operatorname{Re} z$.

Fig. A.5 The points of $A_1$ escape $[0, 1]$ after first iterate

In turn, the latter dynamical system is conjugate to the system generated by the
function

$$F_4 : [0, 1] \to [0, 1], \quad F_4(x) = 4x(1 - x),$$

via the homeomorphism $h : [-1, 1] \to [0, 1]$, $h(x) = \frac{1}{2}(1 - x)$. Thus, the function
$F_4$ generates a chaotic dynamical system. The orbits of this dynamical system have
a simple trigonometric description,

$$x_n = \frac{1 - \cos\left(2^n \arccos(1 - 2x_0)\right)}{2} \quad \text{for } n \in \mathbb{N}.$$
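The trigonometric description of the orbits of $F_4$ is easy to compare with direct iteration. The Python sketch below is a minimal check under stated assumptions: the names `logistic4`, `closed_form` and `max_deviation`, the starting value $x_0 = 0.2$ and the number of steps are illustrative. Only a few iterates are compared, because the chaotic dynamics amplifies rounding errors roughly by a factor of 2 at each step.

```python
import math

def logistic4(x: float) -> float:
    """F_4(x) = 4x(1 - x)."""
    return 4.0 * x * (1.0 - x)

def closed_form(x0: float, n: int) -> float:
    """x_n = (1 - cos(2^n * arccos(1 - 2*x0))) / 2."""
    return (1.0 - math.cos(2**n * math.acos(1.0 - 2.0 * x0))) / 2.0

def max_deviation(x0: float, steps: int) -> float:
    """Largest gap between the iterated orbit and the closed form for n <= steps."""
    x, worst = x0, 0.0
    for n in range(steps + 1):
        worst = max(worst, abs(x - closed_form(x0, n)))
        x = logistic4(x)
    return worst

deviation = max_deviation(0.2, 10)
print(deviation)
```

The agreement reflects the conjugacy $h \circ T_2 = F_4 \circ h$ with $h(x) = (1 - x)/2$: writing $x_0 = h(\cos\theta)$ gives $x_n = h(\cos 2^n\theta)$.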

The function $F_4$ belongs to the family of quadratic functions,

$$F_\lambda : [0, 1] \to [0, 1], \quad F_\lambda(x) = \lambda x(1 - x),$$

known as logistic functions. For $\lambda > 4$, the maximum of $F_\lambda$ is greater than 1 and
some points escape $[0, 1]$ under iteration. As shown in Fig. A.5, the preimage of
$[0, 1]$ under $F_\lambda$ consists of two closed intervals $I_1$ and $I_2$ (symmetric with respect to
$\frac{1}{2}$). Each of these intervals is mapped by $F_\lambda^2$ onto $F_\lambda([0, 1])$, so to keep their image
within $[0, 1]$ we need to eliminate from $I_1$ and $I_2$ suitable open subintervals and so on.

At the end, it remains an invariant subset $\Lambda_\lambda = \bigcap_{n=0}^{\infty} F_\lambda^{-n}([0, 1])$ of $[0, 1]$, on
which the dynamical system generated by $F_\lambda$ has the same features as in the case of
$F_4$: density of periodic points, existence of dense orbits and sensitivity to the initial
conditions. The only difference is that $\Lambda_\lambda$ is thin (homeomorphic to a Cantor set).
The proof of this result is based on conjugacy of $F_\lambda$ with the shift function from
the theory of symbolic dynamics. See the paper of Kraft [3] for details. The chaotic
character of the dynamical system generated by the shift function makes the objective
of Exercise 4.


Exercises 


1. Use semiconjugacy to prove that the following functions generate chaotic
dynamical systems:

(a) The tent map, $T : [0, 1] \to [0, 1]$, $T(x) = 2x$ for $x \in [0, \frac{1}{2}]$, and $T(x) =
2 - 2x$ for $x \in [\frac{1}{2}, 1]$;

(b) The function $f : [0, \pi] \to [0, \pi]$, $f(x) = \pi \sin x$;

(c) The Chebyshev polynomials $T_n(x) = \cos(n \arccos x)$ for $x \in [-1, 1]$ and
$n \ge 2$.

2. Suppose that M is a perfect metric space (that is, without isolated points). Prove 
that any discrete dynamical system (M, f) which has a dense orbit is topologically 
transitive. 

3. Consider the sequence space $2^{\mathbb{N}}$ introduced in Exercise 4, Sect. 6.7. Prove that
the shift function,

$$\sigma : 2^{\mathbb{N}} \to 2^{\mathbb{N}}, \quad \sigma((x_0, x_1, x_2, \ldots)) = (x_1, x_2, x_3, \ldots),$$

generates a chaotic dynamical system.
[Hint: A point $\omega$ with dense orbit can be obtained by gluing all strings of 0 and
1 of length $n$, for $n = 1, 2, 3, \ldots$, that is,

$$\omega = (0, 1,\ 0,0,\ 0,1,\ 1,0,\ 1,1,\ 0,0,0,\ 0,0,1,\ \ldots).]$$


A.3 Notes and Remarks 


There are many introductory books on the theory of dynamical systems. We mention 
here those of Alligood et al. [4] and Devaney [5]. 

Sharkovsky’s Theorem mentioned in Exercise 6, Sect. A.1, is actually a consequence
of a more powerful result (also due to Oleksandr Mykolaiovych Sharkovsky).
Consider the set of positive integers, endowed with Sharkovsky’s ordering,

$$3 \prec 5 \prec 7 \prec \cdots \prec 2 \cdot 3 \prec 2 \cdot 5 \prec \cdots \prec 2^2 \cdot 3 \prec 2^2 \cdot 5 \prec \cdots \prec 2^3 \prec 2^2 \prec 2 \prec 1;$$

the list starts with the odd numbers in increasing order (1 is left out), then it continues
with the same numbers multiplied by 2, next with the same numbers multiplied by
$2^2$ and so on. At the end there are the powers of 2 in decreasing order, the last entry
being $2^0 = 1$.


A.3.1 Sharkovsky’s Theorem If $m \prec n$ and $f$ has a periodic point of prime period
$m$, then $f$ has a periodic point of prime period $n$.


For details, see Devaney [5], pp. 60-68. 

Sharkovsky’s Theorem is specialized to intervals. The rotation of angle $2\pi/3$ on
the unit circle has periodic points of prime period 3 but no periodic point of any other
prime period.

The discovery that even quadratic maps may have intricate dynamics on certain
invariant subsets surprised mathematicians. The most famous example in this respect
is the logistic family

$$F_\lambda(x) = \lambda x(1 - x), \quad x \in \mathbb{R},$$

for $\lambda > 1$. As was pointed out by May [6], these functions may be thought of as
an idealized model for the evolution of a biological population over time; $x$ is the
population density in a certain generation (confined to a certain environment), and
$F_\lambda(x)$ is the population density of the next generation. $F_\lambda(x) = 0$ means extinction
and $F_\lambda(x) = 1$ means saturation.

The critical zone for the behavior of the iterates of $F_\lambda$ is $[0, 1]$. In fact, if $x < 0$,
then $F_\lambda(x) < x$, so that $(F_\lambda^n(x))_n$ is a decreasing sequence of negative numbers. Its
limit, say $\ell$, is necessarily $-\infty$ since otherwise $\ell \in \mathbb{R}$ and

$$\ell = \lim_{n \to \infty} F_\lambda^{n+1}(x) = F_\lambda\left(\lim_{n \to \infty} F_\lambda^{n}(x)\right) = F_\lambda(\ell) < \ell,$$

a contradiction. If $x > 1$, then $F_\lambda(x) < 0$, which yields $F_\lambda^n(x) \to -\infty$ too.

The map $F_\lambda$ has two fixed points: the origin 0 and $p_\lambda = 1 - 1/\lambda$. The origin
is a repellor, while $p_\lambda$ is an attractor for $1 < \lambda < 3$, with $(0, 1)$ as a domain of
attraction. When $\lambda$ passes $\lambda_1 = 3$, the dynamics of $F_\lambda$ changes; we call $\lambda_1$ a value
of bifurcation. The point $p_\lambda$ becomes a repellor and an attracting orbit of period 2 is
created. The next value of bifurcation is $\lambda_2 = 1 + \sqrt{6}$. Past $\lambda_2$, the latter periodic
orbit becomes a repellor and an attracting orbit of period $2^2$ is created. Increasing
$\lambda$, a cascade of periodic orbits (with doubling periods, $2, 2^2, 2^3, \ldots$) occurs.
The corresponding sequence of values of bifurcation, $(\lambda_n)_n$, converges to a value
$\lambda_\infty = 3.569946\ldots$. The ratios of the distances between successive bifurcations
converge too,

$$\lim_{n \to \infty} \frac{\lambda_n - \lambda_{n-1}}{\lambda_{n+1} - \lambda_n} = 4.6692\ldots.$$

If $\lambda > \lambda_\infty$, then the dynamics becomes more irregular. For some values of $\lambda$, there
exist periodic orbits, which in turn give rise to new sequences of period doubling
bifurcations, in a succession motivated by the Sharkovsky Theorem.

For $\lambda^* = 1 + 2\sqrt{2} = 3.8284\ldots$, an attracting periodic orbit of period 3 appears:

$$\alpha = 0.1599\ldots, \quad F_{\lambda^*}(\alpha) = 0.5143\ldots, \quad F_{\lambda^*}^2(\alpha) = 0.9563\ldots, \quad F_{\lambda^*}^3(\alpha) = 0.1599\ldots$$

A simple proof can be found in [26]. This orbit can be easily visualized with a
computer by iterating (for a sufficiently large number of steps) any point of $(0, 1)$.
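The computer experiment mentioned above can be sketched in a few lines of Python. Two caveats apply, and both are assumptions of this illustration rather than statements from the text: exactly at $\lambda^*$ the 3-cycle is only marginally stable, so convergence of the iterates is very slow, and for that reason the long run below uses the nearby value $\lambda = 3.83$ inside the period-3 window. The names `logistic` and `settle`, the starting point $0.5$ and the burn-in length are hypothetical.

```python
import math

def logistic(lam: float, x: float) -> float:
    """F_lambda(x) = lambda * x * (1 - x)."""
    return lam * x * (1.0 - x)

def settle(lam: float, x0: float, burn_in: int) -> float:
    """Iterate F_lambda burn_in times starting from x0 and return the last point."""
    x = x0
    for _ in range(burn_in):
        x = logistic(lam, x)
    return x

# Check of the quoted orbit at lambda* = 1 + 2*sqrt(2), starting from the
# four-digit value of alpha (so only about three digits can be expected).
lam_star = 1.0 + 2.0 * math.sqrt(2.0)
quoted = [0.1599]
for _ in range(3):
    quoted.append(logistic(lam_star, quoted[-1]))

# Long run slightly inside the period-3 window (lambda = 3.83 is an assumption).
lam = 3.83
x = settle(lam, 0.5, 5000)
cycle = [x, logistic(lam, x), logistic(lam, logistic(lam, x))]
residual = abs(logistic(lam, cycle[2]) - cycle[0])

print(quoted)
print(sorted(cycle), residual)
```

After the transient, the orbit visits three points in turn; the repelling periodic orbits of other periods, whose existence follows from Sharkovsky’s Theorem, are not seen by such an experiment.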

According to Sharkovsky’s Theorem, $F_{\lambda^*}$ admits periodic orbits of any period.
However, being repelling, they do not appear on the computer screen!

For larger values of $\lambda$ (but not for all of them) there exist orbits which are dense
in the whole interval $[0, 1]$. In fact, it was noticed by Graczyk and Świątek [27] and
Lyubich [29, 30] that:

• The set $\mathcal{H}$ of values of $\lambda \in [1, 4]$ for which $F_\lambda$ has a periodic attractor is open and
dense.

• Except for a Lebesgue negligible subset of $[1, 4] \setminus \mathcal{H}$, the map $F_\lambda$ has a dense
orbit.


Cantor’s triadic set has many interesting features which are best understood within
the theory of iterated function systems, as developed by Hutchinson [28]. His basic
remark is as follows:


A.3.2 Theorem Consider a complete metric space $M$.

(a) The set $\mathcal{K}(M)$, of all nonempty compact subsets of $M$, is a complete metric space
with respect to the Pompeiu-Hausdorff metric,

$$d_{PH}(A, B) = \max\left\{ \sup_{a \in A} \inf_{b \in B} d(a, b), \ \sup_{b \in B} \inf_{a \in A} d(a, b) \right\}.$$

(b) Let $T_1, \ldots, T_N$ be contractions of $M$ into itself, of Lipschitz constant $r$. Then the
function

$$T : \mathcal{K}(M) \to \mathcal{K}(M), \quad T(A) = T_1(A) \cup \cdots \cup T_N(A),$$

is also a contraction, of Lipschitz constant $r$.


The dynamical system generated by the function $T$ is called an iterated function
system. According to the Contraction Principle, it admits a global attractor $A$, which
verifies

$$A = T_1(A) \cup \cdots \cup T_N(A).$$

The Hausdorff dimension of this attractor $A$ verifies the relation

$$\dim_H(A) \le -\frac{\log N}{\log r}.$$

Cantor’s triadic set corresponds to the case where $M = [0, 1]$, $N = 2$ and

$$T_1(x) = x/3, \quad T_2(x) = (x + 2)/3.$$
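The iteration $A \mapsto T_1(A) \cup T_2(A)$ can be carried out explicitly when the starting compact set is $[0, 1]$. The Python sketch below is a hedged illustration: the names `hutchinson_step` and `cantor_approximation` are hypothetical, sets are represented as lists of closed intervals with exact rational endpoints, and only the case $T_1(x) = x/3$, $T_2(x) = (x + 2)/3$ is treated.

```python
from fractions import Fraction
import math

def hutchinson_step(intervals):
    """Apply A -> T1(A) U T2(A) to a finite union of closed intervals."""
    third = Fraction(1, 3)
    left = [(a * third, b * third) for a, b in intervals]                # T1(x) = x/3
    right = [((a + 2) * third, (b + 2) * third) for a, b in intervals]   # T2(x) = (x+2)/3
    return left + right

def cantor_approximation(n: int):
    """n-th iterate of the map T starting from the interval [0, 1]."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        intervals = hutchinson_step(intervals)
    return intervals

level1 = cantor_approximation(1)
level4 = cantor_approximation(4)
total_length = sum(b - a for a, b in level4)

# Upper bound -log N / log r for N = 2 maps of Lipschitz constant r = 1/3.
dimension_bound = -math.log(2) / math.log(1 / 3)
print(len(level4), total_length, dimension_bound)
```

Each step doubles the number of intervals and divides their length by 3, so the total length $(2/3)^n$ tends to 0 while the iterates converge to the Cantor set in the Pompeiu-Hausdorff metric.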


Appendix B 
Special Topics on Differentiability 


This Appendix is devoted to a discussion of the Fundamental Theorem of Calculus 
within the framework of Lebesgue integral. The central ingredient is the almost 
everywhere differentiability of monotone functions, which was proved by Lebesgue 
in his celebrated book on integrals and antiderivatives [7]. 


B.1 Dini’s Derivatives 


Let $a \in \mathbb{R}$. If $f$ is a real-valued function whose domain has $a$ as a point of
accumulation, then the inferior limit and the superior limit of $f$ at $a$ can be computed by the
formulas

$$\liminf_{x \to a} f(x) = \sup_{V \in \mathcal{V}_a} \ \inf_{x \in V \setminus \{a\}} f(x)$$

and

$$\limsup_{x \to a} f(x) = \inf_{V \in \mathcal{V}_a} \ \sup_{x \in V \setminus \{a\}} f(x).$$

See Sect. 6.6. These limits always exist as elements in $\overline{\mathbb{R}}$ and constitute a refinement
of the notion of limit at a point.

If the domain includes intervals of the form $(a, a + \varepsilon]$ (for some $\varepsilon > 0$), we can
also consider the right inferior limit of $f$ at $a$ and the right superior limit of $f$ at $a$,
respectively defined by

$$\liminf_{x \to a+} f(x) = \sup_{x \in (a, a + \varepsilon]} \ \inf_{t \in (a, x]} f(t)$$

and

$$\limsup_{x \to a+} f(x) = \inf_{x \in (a, a + \varepsilon]} \ \sup_{t \in (a, x]} f(t).$$


© Springer India 2014 481 
A.D.R. Choudary and C.P. Niculescu, Real Analysis on Intervals, 
DOI 10.1007/978-8 1-322-2148-7 




These limits always exist as elements in $\overline{\mathbb{R}}$ and constitute a refinement of the notion
of right limit at a point.

Similarly, if the domain includes intervals of the form $[a - \varepsilon, a)$ (for some $\varepsilon > 0$),
we can define the left inferior limit of $f$ at $a$,

$$\liminf_{x \to a-} f(x) = \sup_{x \in [a - \varepsilon, a)} \ \inf_{t \in [x, a)} f(t),$$

and the left superior limit of $f$ at $a$,

$$\limsup_{x \to a-} f(x) = \inf_{x \in [a - \varepsilon, a)} \ \sup_{t \in [x, a)} f(t).$$

They constitute a refinement of the notion of left limit at a point.
It is not difficult to see that the following type of statement holds:

$$\liminf_{x \to a+} f(x) = \inf_{a_n \downarrow a} \ \liminf_{n \to \infty} f(a_n).$$
Dini’s derivatives are defined in a similar way. If $f$ is a real function whose domain
includes intervals of the form $(a, a + \varepsilon]$ (for some $\varepsilon > 0$), we define the right lower
Dini’s derivative by

$$D_+ f(a) = \liminf_{x \to a+} \frac{f(x) - f(a)}{x - a}$$

and the right upper Dini’s derivative by

$$D^+ f(a) = \limsup_{x \to a+} \frac{f(x) - f(a)}{x - a}.$$

If $f$ is a real function whose domain includes intervals of the form $[a - \varepsilon, a)$ (for
some $\varepsilon > 0$), we define the left lower Dini’s derivative by

$$D_- f(a) = \liminf_{x \to a-} \frac{f(x) - f(a)}{x - a}$$

and the left upper Dini’s derivative by

$$D^- f(a) = \limsup_{x \to a-} \frac{f(x) - f(a)}{x - a}.$$

It is clear that $D_- f(a) \le D^- f(a)$ and $D_+ f(a) \le D^+ f(a)$ and that a function has
a derivative at a point if and only if all Dini’s derivatives that can be defined at that
point are equal.
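A rough numerical feeling for the four Dini derivatives can be obtained by sampling difference quotients on one small one-sided window. The Python sketch below is only indicative and rests on explicit assumptions: the true definitions involve a limit as the window shrinks, whereas here a single window of width `h` is sampled at finitely many points; the function name `dini_estimates` and its default parameters are hypothetical.

```python
def dini_estimates(f, a: float, h: float = 1e-3, samples: int = 1000) -> dict:
    """Crude estimates of D_+, D^+, D_-, D^- of f at a from one window of width h.

    The inf/sup of the difference quotient is taken over sample points only,
    so the result approximates the definitions without replacing them.
    """
    steps = [h * k / samples for k in range(1, samples + 1)]
    right = [(f(a + t) - f(a)) / t for t in steps]       # x = a + t > a
    left = [(f(a - t) - f(a)) / (-t) for t in steps]     # x = a - t < a
    return {
        "D_+": min(right), "D^+": max(right),
        "D_-": min(left), "D^-": max(left),
    }

# Example: the absolute value at 0 has right derivative 1 and left derivative -1.
estimates = dini_estimates(abs, 0.0)
print(estimates)
```

For $|x|$ at $0$ the two right Dini derivatives agree and the two left ones agree, but the common values differ, which is exactly the situation of a corner point.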


B.1.1 Lemma For every function $f : (a, b) \to \mathbb{R}$, the set of all points where both
one-sided derivatives exist and are different is countable.

Proof The set in the statement is the union of the sets

$$A = \{x : f'_+(x) < f'_-(x)\} \quad \text{and} \quad B = \{x : f'_-(x) < f'_+(x)\}.$$

We will show that $A$ is countable by constructing an injective function from $A$ into
$\mathbb{Q}^3$. A similar statement works for $B$.

If $f'_+(x) < f'_-(x)$, then there is a rational number $r_x$ such that $f'_+(x) < r_x < f'_-(x)$.
Let $u_x$ and $v_x$ be rational numbers such that $a < u_x < x < v_x < b$ and

$$\frac{f(y) - f(x)}{y - x} > r_x \quad \text{for } y \in (u_x, x)$$

and

$$\frac{f(y) - f(x)}{y - x} < r_x \quad \text{for } y \in (x, v_x).$$

Notice that in both cases we get

$$f(y) - f(x) < r_x (y - x).$$

Let $\varphi : A \to \mathbb{Q}^3$, $\varphi(x) = (r_x, u_x, v_x)$. Suppose that there are $x \neq y$ such that
$\varphi(x) = \varphi(y)$. Then $x, y \in (u_x, v_x) = (u_y, v_y)$ and so

$$f(y) - f(x) < r_x (y - x)$$

and

$$f(x) - f(y) < r_y (x - y) = r_x (x - y).$$

By adding the last two inequalities side by side we obtain $0 < 0$, which is a
contradiction. Therefore $\varphi$ is injective.


Exercises 


1. Let

$$f(x) = \begin{cases} x \sin \dfrac{1}{x} & \text{if } x \neq 0 \\ 0 & \text{if } x = 0. \end{cases}$$

Compute Dini’s derivatives of $f$ at 0.
2. Prove that if $f(x) = g(x) + Cx$ (where $C$ is a constant), then $D^+ f(a) =
D^+ g(a) + C$.
3. Prove that:
(a) If $D^+ f(a) < c$, then there is $\delta > 0$ such that

$$f(x) - f(a) < c(x - a) \quad \text{for all } x \in (a, a + \delta).$$

(b) If $D^+ f(a) > c$, then there is a sequence $(a_n)_n$ decreasing to $a$, such that
$f(a_n) - f(a) > c(a_n - a)$ for every $n$.
4. Prove that if $D_- f(a) > 0 > D^+ f(a)$, then $a$ is a strict local maximum for the
function $f$.
5. (W. Sierpiński). Prove that the set of all strict local maximum points of a real
valued function is countable.
6. (A. Denjoy). Let $f : I \to \mathbb{R}$. Prove that except for a countable set of points,

$$D_- f(x) \le D^+ f(x) \quad \text{and} \quad D_+ f(x) \le D^- f(x).$$

[Hint: Let $H = \{x : D^+ f(x) < D_- f(x)\}$. Then $H = \bigcup_{r \in \mathbb{Q}} H_r$, where $H_r =
\{x : D^+ f(x) < r < D_- f(x)\}$. By Exercise 4, $H_r$ consists of points of strict local
maximum for $f(x) - rx$. By Exercise 5, this set is countable.]
7. (A generalization of Lagrange Mean Value Theorem). Let $h : [a, b] \to \mathbb{R}$ be a
continuous function. Prove that there is a point $c \in (a, b)$ such that

$$\underline{D} h(c) \le \frac{h(b) - h(a)}{b - a} \le \overline{D} h(c).$$

Here the lower and respectively the upper derivative of $h$ at $c$ are defined by

$$\underline{D} h(c) = \liminf_{x \to c} \frac{h(x) - h(c)}{x - c} \quad \text{and} \quad \overline{D} h(c) = \limsup_{x \to c} \frac{h(x) - h(c)}{x - c}.$$

[Hint: Consider the function $H(x) = h(x) - \frac{h(b) - h(a)}{b - a}(x - a)$ for $x \in [a, b]$.]
8. Use the argument of Theorem 8.11.1 to show that the set of all functions $f$ in
$C([0, 1], \mathbb{R})$ which have at least one infinite right Dini derivative at every point
of $[0, 1)$ is dense in $C([0, 1], \mathbb{R})$.


B.2. Lebesgue’s Differentiation Theorem 


Let $A$ be a subset of $\mathbb{R}$. A family $\mathcal{V}$ of closed nondegenerate intervals is called a Vitali
covering of $A$ if for every $x \in A$ and every $\varepsilon > 0$, there is an interval $I \in \mathcal{V}$ such that
$x \in I$ and $\ell(I) < \varepsilon$ (that is, every point in $A$ belongs to arbitrarily short intervals in $\mathcal{V}$).


B.2.1 Theorem (Vitali’s Covering Theorem) Let $A$ be a subset of $\mathbb{R}$ and let $\mathcal{V}$ be a
Vitali covering of $A$. Then, there is a countable subset $\mathcal{W}$ of $\mathcal{V}$, consisting of mutually
disjoint sets and such that $A \setminus \bigcup_{I \in \mathcal{W}} I$ is a Lebesgue null set. If $A$ is a bounded set, then
for every $\varepsilon > 0$, there is a family $\{I_1, I_2, \ldots, I_s\}$ of mutually disjoint intervals in $\mathcal{V}$
such that $A \setminus \bigcup_{k=1}^{s} I_k$ can be covered by sequences of intervals of total length $< \varepsilon$.


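Proof First, we will consider the case where $A \subset (0, 1)$.
Let $I_1 \in \mathcal{V}$ be such that $\ell(I_1) > \frac{1}{2} \sup\{\ell(I) : I \in \mathcal{V}\}$. Then, choose $I_2 \in \mathcal{V}$ such that
$I_2 \cap I_1 = \emptyset$ and

$$\ell(I_2) > \frac{1}{2} \sup\{\ell(I) : I \in \mathcal{V}, \ I \cap I_1 = \emptyset\}.$$

Next, choose $I_3 \in \mathcal{V}$ such that $I_3 \cap (I_1 \cup I_2) = \emptyset$ and

$$\ell(I_3) > \frac{1}{2} \sup\{\ell(I) : I \in \mathcal{V}, \ I \cap (I_1 \cup I_2) = \emptyset\},$$

etc. This process produces a countable family $(I_k)_k$ of mutually disjoint intervals in
$\mathcal{V}$. If the family is finite, then $A \subset \bigcup_k I_k$ and the proof is complete.
For the rest of the proof, we will consider that the family $(I_k)_k$ is countably infinite.

Claim: If $I \in \mathcal{V}$, then

$$I \cap \left(\bigcup_{k=1}^{\infty} I_k\right) \neq \emptyset. \qquad \text{(B.1)}$$

Proof of Claim: Suppose not. Then there is an interval $J \in \mathcal{V}$ such that $J \cap \left(\bigcup_{k=1}^{\infty} I_k\right) = \emptyset$
and

$$\ell(J) \ge \frac{1}{2} \sup\left\{\ell(I) : I \in \mathcal{V}, \ I \cap \left(\bigcup_{k=1}^{\infty} I_k\right) = \emptyset\right\}.$$

Since the intervals $I_n$ are disjoint and included in $[0, 1]$, necessarily $\ell(I_n) \to 0$. Thus,
there is $k_0$ such that $\ell(I_{k_0}) < \ell(J)/2$. This implies

$$\frac{1}{2}\ell(J) > \ell(I_{k_0}) > \frac{1}{2} \sup\left\{\ell(I) : I \in \mathcal{V}, \ I \cap \bigcup_{k=1}^{k_0 - 1} I_k = \emptyset\right\} \ge \frac{1}{2}\ell(J),$$

which is a contradiction. This finishes the proof of the claim.
We shall show that $A \setminus \left(\bigcup_{k=1}^{\infty} I_k\right)$ is a null set. It suffices to prove that for every
$\varepsilon > 0$, there is a positive integer $h$ such that the set $A \setminus \left(\bigcup_{k=1}^{h} I_k\right)$ can be covered
by a sequence of intervals of total length less than $\varepsilon$.
Let $\eta > 0$ be arbitrarily fixed and choose $h$ such that $\sum_{k=h+1}^{\infty} \ell(I_k) < \eta$. Then

$$\begin{aligned}
A \setminus \left(\bigcup_{k=1}^{h} I_k\right) &\subset \bigcup\left\{I : I \in \mathcal{V}, \ I \cap \left(\bigcup_{k=1}^{h} I_k\right) = \emptyset\right\} \\
&= \bigcup\left\{I : I \in \mathcal{V}, \ I \cap \left(\bigcup_{k=1}^{h} I_k\right) = \emptyset, \ I \cap \left(\bigcup_{k=h+1}^{\infty} I_k\right) \neq \emptyset\right\} \\
&= \bigcup_{j=h}^{\infty} \bigcup\left\{I : I \in \mathcal{V}, \ I \cap \left(\bigcup_{k=1}^{j} I_k\right) = \emptyset, \ I \cap I_{j+1} \neq \emptyset\right\} \\
&\subset \bigcup_{j=h}^{\infty} \bigcup\left\{I : I \in \mathcal{V}, \ \ell(I) < 2\ell(I_{j+1}), \ I \cap I_{j+1} \neq \emptyset\right\}.
\end{aligned}$$

The first inclusion above is true because the set $\bigcup_{k=1}^{h} I_k$ is compact (which implies
that for every $x \in A \setminus \left(\bigcup_{k=1}^{h} I_k\right)$ there is an interval $I(x) \in \mathcal{V}$ containing $x$ with the property that
$I(x) \cap \left(\bigcup_{k=1}^{h} I_k\right) = \emptyset$). The next equality is true because of the Claim. The rest of
the argument is a consequence of the choice of the intervals $I_j$.
Let $I_{j+1} = [u_{j+1}, v_{j+1}]$. Then

$$\bigcup\left\{I : I \in \mathcal{V}, \ \ell(I) < 2\ell(I_{j+1}), \ I \cap I_{j+1} \neq \emptyset\right\}$$

is included in $[u_{j+1} - 2\ell(I_{j+1}), \ v_{j+1} + 2\ell(I_{j+1})]$. Thus, $A \setminus \left(\bigcup_{k=1}^{h} I_k\right)$ can be
covered by a family of intervals of total length less than

$$5 \sum_{j=h+1}^{\infty} \ell(I_j) < 5\eta.$$

Choosing $\eta > 0$ such that $5\eta < \varepsilon$ will complete the argument in the case where
$A \subset (0, 1)$.

In the general case, we repeat the reasoning above for each of the intersections
$A \cap (n, n + 1)$, for $n \in \mathbb{Z}$.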
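The selection procedure in the proof has a simple finite analogue, sketched below in Python. This is an illustration under stated assumptions rather than the argument of the proof: for a finite family the supremum of the lengths is attained, so one may always pick a longest interval disjoint from those already chosen, and each discarded interval then lies in the enlargement of some chosen interval by its own length on each side (factor 3 instead of the factor 5 of the proof). The names `greedy_disjoint` and `enlarge` and the sample family are hypothetical.

```python
def length(iv):
    """Length of a closed interval given as a pair (a, b)."""
    return iv[1] - iv[0]

def greedy_disjoint(intervals):
    """Repeatedly pick a longest interval disjoint from those already chosen."""
    chosen = []
    for a, b in sorted(intervals, key=length, reverse=True):
        # Closed intervals [a, b] and [c, d] are disjoint iff b < c or d < a.
        if all(b < c or d < a for c, d in chosen):
            chosen.append((a, b))
    return chosen

def enlarge(iv):
    """Enlarge [a, b] by its own length on each side (total length 3(b - a))."""
    a, b = iv
    pad = b - a
    return (a - pad, b + pad)

# Illustrative finite family of closed intervals.
family = [(0.0, 4.0), (3.0, 5.0), (4.5, 6.0), (7.0, 8.0), (7.5, 9.5)]
chosen = greedy_disjoint(family)
covers = all(
    any(enlarge(J)[0] <= a and b <= enlarge(J)[1] for J in chosen)
    for a, b in family
)
print(chosen, covers)
```

In the proof only "more than half of the supremum" can be guaranteed at each step, which is why a discarded interval may be up to twice as long as the chosen interval it meets and the enlargement factor becomes 5.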


B.2.2 Lebesgue’s Differentiation Theorem A monotone function, defined on an 
interval, is almost everywhere differentiable. 


Proof Clearly, it suffices to consider the case of increasing functions $f : [a, b] \to \mathbb{R}$.
We will first show that the set

$$A = \{x : x \in [a, b), \ D_+ f(x) < D^+ f(x)\}$$

is a Lebesgue null set. Since $A$ is the union of the following countable family of
measurable sets,

$$A_{p,q} = \{x : x \in A, \ D_+ f(x) < p < q < D^+ f(x)\}, \quad 0 < p < q, \ p, q \in \mathbb{Q},$$

we have to prove that each of the sets $A_{p,q}$ is a Lebesgue null set.
Assume that a certain set $A_{p,q}$ has measure $\alpha > 0$. Let $\varepsilon$ be such that

$$0 < \varepsilon < \frac{\alpha(q - p)}{p + 2q}.$$

Since

$$\inf\left\{\sum_{n=0}^{\infty} \ell(I_n) : (I_n)_n \text{ sequence of open intervals with } \bigcup_{n=0}^{\infty} I_n \supset A_{p,q}\right\} = \alpha > 0,$$

one can choose a sequence $(I_n)_n$ of open intervals such that $\bigcup_n I_n = U \supset A_{p,q}$ and
$\sum_n \ell(I_n) < \alpha + \varepsilon$. Therefore for every $x \in A_{p,q}$ there are numbers $h > 0$, arbitrarily
small, such that $[x, x + h] \subset U \cap [a, b]$ and

$$f(x + h) - f(x) < ph.$$

The family $\mathcal{V}$ of all these intervals $[x, x + h]$ is a Vitali covering of $A_{p,q}$ and
so, by Vitali’s Covering Theorem, there is a finite pairwise disjoint subfamily
$\{[x_i, x_i + h_i]\}_{i=1}^{m}$ of $\mathcal{V}$ such that

$$A_{p,q} \setminus \left(\bigcup_{i=1}^{m} [x_i, x_i + h_i]\right)$$

can be covered by sequences of intervals of total length less than $\varepsilon$.
Let $V = \bigcup_{i=1}^{m} [x_i, x_i + h_i]$. Since $V \subset U$, we have $\sum_{i=1}^{m} h_i < \alpha + \varepsilon$. Then

$$\sum_{i=1}^{m} \left(f(x_i + h_i) - f(x_i)\right) < p \sum_{i=1}^{m} h_i < p(\alpha + \varepsilon).$$

Similarly, for every $y \in A_{p,q} \cap V$, there exist arbitrarily small numbers $k > 0$
such that $[y, y + k] \subset V$ and

$$f(y + k) - f(y) > qk.$$

The family of all these intervals $[y, y + k]$ is a Vitali covering of $A_{p,q} \cap V$, and so
there exists a finite subfamily $\{[y_j, y_j + k_j]\}_{j=1}^{n}$ of pairwise disjoint intervals, such
that

$$(A_{p,q} \cap V) \setminus \bigcup_{j=1}^{n} [y_j, y_j + k_j]$$

can be covered by sequences of intervals of total length less than $\varepsilon$. Taking into
account the decomposition

$$A_{p,q} = (A_{p,q} \setminus V) \cup (A_{p,q} \cap V)$$

we infer that

$$\alpha < 2\varepsilon + \sum_{j=1}^{n} k_j.$$

Thus,

$$q(\alpha - 2\varepsilon) < q \sum_{j=1}^{n} k_j < \sum_{j=1}^{n} \left(f(y_j + k_j) - f(y_j)\right).$$

Since $\bigcup_{j=1}^{n} [y_j, y_j + k_j] \subset \bigcup_{i=1}^{m} [x_i, x_i + h_i]$ and $f$ is an increasing function, we
also have

$$\sum_{j=1}^{n} \left(f(y_j + k_j) - f(y_j)\right) \le \sum_{i=1}^{m} \left(f(x_i + h_i) - f(x_i)\right).$$

This implies $q(\alpha - 2\varepsilon) < p(\alpha + \varepsilon)$, which contradicts the choice of $\varepsilon$. Therefore
$A$ is a Lebesgue null set and $f'_+(x)$ exists almost everywhere. Similarly, $f'_-(x)$ exists
almost everywhere on $[a, b]$. By Lemma B.1.1, it follows that $f'(x)$ exists almost
everywhere.

It remains to show that the set $X$, of points $x$ in $(a, b)$ for which $f'(x) = \infty$, is a
Lebesgue null set. Let $\gamma > 0$. For every $x \in X$, there exist arbitrarily small numbers
$h > 0$ such that $[x, x + h] \subset (a, b)$ and

$$f(x + h) - f(x) > h/\gamma.$$

The family of all such intervals $[x, x + h]$ is a Vitali covering of $X$ and so, by Vitali’s
Covering Theorem, there exists a countable subfamily consisting of pairwise disjoint
intervals $[x_n, x_n + h_n]$, such that $X \setminus \left(\bigcup_n [x_n, x_n + h_n]\right)$ is a Lebesgue null set. Since

$$\begin{aligned}
\lambda(X) &= \lambda\left(X \setminus \bigcup_n [x_n, x_n + h_n]\right) + \lambda\left(X \cap \bigcup_n [x_n, x_n + h_n]\right) \\
&\le \sum_n h_n \\
&< \gamma \cdot \sum_n \left(f(x_n + h_n) - f(x_n)\right) \le \gamma \cdot \left(f(b) - f(a)\right),
\end{aligned}$$

and $\gamma > 0$ was arbitrarily fixed, we conclude that $X$ is a Lebesgue negligible set.


An apparently open question is the existence of an increasing function with a
prescribed set of nondifferentiability. More precisely, if $A \subset [a, b]$ and $\lambda(A) = 0$, is
it possible to find an increasing function $f : [a, b] \to \mathbb{R}$ such that $f'$ exists exactly
on $(a, b) \setminus A$?

As a consequence of Lebesgue’s Differentiation Theorem, we will prove the following
result concerning the differentiation term-by-term of series of increasing
functions:


B.2.3 Theorem (Fubini’s Differentiation Theorem) Let $(f_n)_n$ be a sequence of
increasing functions on an interval $[a, b]$ such that $\sum_{n=0}^{\infty} f_n(x) = S(x)$ exists and is
finite for every $x \in [a, b]$. Then

$$S'(x) = \sum_{n=0}^{\infty} f_n'(x) \quad \text{almost everywhere.}$$
n=0 


Proof Without loss of generality we may assume that all functions $f_n$ are positive
and vanish at $x = a$; for this, replace $f_n(x)$ by $f_n(x) - f_n(a)$ if necessary.

By Lebesgue’s Differentiation Theorem, there exists a Lebesgue null set $X$ such
that all functions $f_n(x)$ and $S(x)$ are differentiable on $[a, b] \setminus X$. For all $x \in [a, b] \setminus X$
and $y \neq x$ in $[a, b]$, we have

$$\frac{S(y) - S(x)}{y - x} = \sum_{n=0}^{\infty} \frac{f_n(y) - f_n(x)}{y - x}$$

and taking into account that the series is positive, we infer that

$$\sum_{n=0}^{N} \frac{f_n(y) - f_n(x)}{y - x} \le \frac{S(y) - S(x)}{y - x}$$

for any natural number $N$. Passing to the limit as $y \to x$, we get

$$\sum_{n=0}^{N} f_n'(x) \le S'(x),$$

from which, letting $N \to \infty$, we deduce that the positive series $\sum_{n=0}^{\infty} f_n'(x)$ is
convergent and its sum is bounded above by $S'(x)$ for every $x \in [a, b] \setminus X$. The proof
ends by showing that the sum of this series is precisely $S'(x)$.

Put

$$S_n(x) = \sum_{k=0}^{n} f_k(x), \quad n \in \mathbb{N},$$

and choose an increasing sequence $(k_n)_n$ of natural numbers such that

$$0 \le S(b) - S_{k_n}(b) \le \frac{1}{2^n}$$

for every $n$. Since $S(x) - S_{k_n}(x) = \sum_{j=k_n+1}^{\infty} f_j(x)$ is the sum of a series of
positive increasing functions, we get

$$0 \le S(x) - S_{k_n}(x) \le \frac{1}{2^n} \quad \text{for all } n \in \mathbb{N} \text{ and } x \in [a, b].$$

Therefore the series $\sum_n \left(S(x) - S_{k_n}(x)\right)$ (consisting of increasing positive functions)
is uniformly convergent and the above reasoning yields the convergence of
$\sum_n \left(S'(x) - S'_{k_n}(x)\right)$ almost everywhere. Then $S'_{k_n}(x) \to S'(x)$ almost everywhere. Since $(S_n'(x))_n$
is increasing for every $x \in [a, b] \setminus X$, we conclude that $S_n'(x) \to S'(x)$ almost
everywhere on $[a, b]$.


Lebesgue’s Theorem can be strengthened by using the following result, which
was the object of Exercise 6, Sect. 11.3.


B.2.4 Lemma (Fatou’s Lemma) Let $(f_n)_n$ be a sequence of positive measurable
functions. If

$$\liminf_{n \to \infty} \int_{\mathbb{R}} f_n \, dt < \infty,$$

then $\liminf_{n \to \infty} f_n$ is integrable and

$$\int_{\mathbb{R}} \liminf_{n \to \infty} f_n \, dt \le \liminf_{n \to \infty} \int_{\mathbb{R}} f_n \, dt.$$

B.2.5 Theorem Let $F : [a, b] \to \mathbb{R}$ be an increasing function. Then $F$ is almost
everywhere differentiable and the function $F'$, extended arbitrarily at the points where
$F$ is not differentiable, is Lebesgue integrable. Moreover,

$$\int_a^b F'(t) \, dt \le F(b) - F(a).$$

Proof We extend $F$ by defining $F(x) = F(b)$ for $x > b$ and $F(x) = F(a)$ for $x < a$.
Let

$$F_n(x) = \frac{F(x + 1/n) - F(x)}{1/n}.$$

By Lebesgue’s Differentiation Theorem, $F_n \to F'$ almost everywhere on $[a, b]$.
Since the functions $F_n$ are measurable, so is $F'$ (see Theorem 11.5.3), and thus,
Fatou’s Lemma applies. Consequently,

$$\begin{aligned}
\int_a^b F'(t) \, dt &= \int_a^b \lim_{n \to \infty} F_n(t) \, dt \le \liminf_{n \to \infty} \int_a^b F_n(t) \, dt \\
&= \liminf_{n \to \infty} \, n \int_a^b \left[F(t + 1/n) - F(t)\right] dt \\
&= \liminf_{n \to \infty} \, n \left[\int_b^{b + 1/n} F(t) \, dt - \int_a^{a + 1/n} F(t) \, dt\right] \\
&\le F(b) - F(a).
\end{aligned}$$

The inequality in the previous theorem may be strict, as shown by the case of the
Cantor-Lebesgue singular function. See Exercise 4. In Sect. B.4 we will discuss the
equality case.


Exercises 


1. Let $A$ be a subset of $(a,b)$ that is not a Lebesgue null set; this means the existence
of a number $\varepsilon > 0$ such that $\sum_n \ell(I_n) \ge \varepsilon$ for any sequence $(I_n)_n$ of open intervals
that covers $A$. Use the Borel-Lebesgue Lemma to show that every family $\mathcal{F}$ of
open subintervals of $(a,b)$ that covers $A$ contains a finite subfamily $\mathcal{S}$ consisting
of disjoint sets $J_1, \dots, J_m$ such that $\sum_{k=1}^{m} \ell(J_k) > \varepsilon/3$.

2. Infer from the previous result that every increasing function f on [a, b] admits a 
finite lower derivative almost everywhere. 

Note An alternative proof of Lebesgue’s Differentiation Theorem based on
Exercises 1 and 2 above can be found in the paper of Botsko [8].

3. Let $f$ be a real-valued function defined on $[a,b]$. If $A$ is a subset of $[a,b]$ where
$f'$ exists and $|f'(x)| \le C$ for $x \in A$, prove that $\lambda(f(A)) \le C\lambda(A)$.

[Hint: Consider a Vitali covering of $f(A)$ by intervals $[f(x), f(x+h)]$ such that
f(x +h) CEO. f@+M1 

4. (The Cantor-Lebesgue singular function). Consider the function $\varphi$ (from $[0,1]$
onto itself) defined by

\[
\begin{aligned}
&\varphi(0) = 0, \quad \varphi(1) = 1; \\
&\varphi(x) = \frac{2k-1}{2^n} \quad \text{for } x \in I_{n,k}, \ k \in \{1, \dots, 2^{n-1}\} \text{ and } n \in \mathbb{N}^*; \\
&\varphi(x) = \sup\{\varphi(t) : t \in [0,1] \setminus A, \ t < x\},
\end{aligned}
\]

where $A$ denotes the Cantor set and $I_{n,k}$ is the $k$-th (counted from the left) of the
$2^{n-1}$ open intervals of length $3^{-n}$ removed at step $n$ of its construction. Its graph
is sketched in Fig. B.1.

Fig. B.1 The Cantor-Lebesgue singular function

Prove that:

(a) The function $\varphi$ is the uniform limit of the functions $\varphi_n$ whose graphs are
the polygonal lines from $(0,0)$ to $(1,1)$ and which are horizontal,
$\varphi_n(x) = \frac{2k-1}{2^m}$, on $I_{m,k}$ for $k = 1, 2, \dots, 2^{m-1}$ and $m = 1, 2, \dots, n$.

(b) The function $\varphi$ is continuous and increasing.

(c) If $x \in [0,1] \setminus A$, then there is an open interval containing $x$ on which, starting
at a certain rank, all functions $\varphi_n$ are constant (and equal). Then show that
$\varphi$ is differentiable at $x$ and that $\varphi'(x) = 0$.

(d) $\varphi$ maps the Cantor set continuously onto $[0,1]$.

(e) $\varphi(x+y) \le \varphi(x) + \varphi(y)$ for all $x, y \in [0,1]$.
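
For readers who want to experiment with the function of Exercise 4, the following Python sketch evaluates it through the ternary expansion of the argument. The name `cantor` and the truncation parameter `depth` are illustrative assumptions, not notation from the text; exact rational arithmetic is used so that points such as 1/3 or 7/9 are handled without rounding.

```python
from fractions import Fraction

def cantor(x, depth=40):
    """Approximate the Cantor-Lebesgue function at a rational x in [0, 1].

    The ternary digits of x are read one by one: a digit 2 contributes a
    binary digit 1, and the first digit 1 means that x lies in the closure
    of a removed interval, where the function is constant.
    """
    x = Fraction(x)
    if x <= 0:
        return Fraction(0)
    if x >= 1:
        return Fraction(1)
    value, scale = Fraction(0), Fraction(1, 2)
    for _ in range(depth):
        x *= 3
        digit = int(x)            # next ternary digit: 0, 1 or 2
        x -= digit
        if digit == 1:            # plateau of height value + scale
            return value + scale
        value += scale * (digit // 2)
        scale /= 2
    return value                  # truncation error is at most 2**(-depth)

print(cantor(Fraction(1, 3)), cantor(Fraction(7, 9)))
print(float(cantor(Fraction(1, 4))))
```

Points of the removed intervals are resolved after finitely many digits, while points of the Cantor set with infinite ternary expansion (such as 1/4) are only approximated up to the stated truncation error.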


B.3 Functions of Bounded Variation 


The variation of a function $f : [a,b] \to \mathbb{C}$ is defined by the formula

\[ V_a^b(f) = \sup_{\Delta} \sum_{k=0}^{n-1} |f(x_{k+1}) - f(x_k)|, \]

where the supremum is taken over all divisions $\Delta = \{x_0, \dots, x_n\}$ of $[a,b]$. In general,

\[ 0 \le V_a^b(f) \le \infty. \]

If $V_a^b(f) < \infty$, then $f$ is said to be a function of bounded variation.
Clearly, $V_a^b(f) = 0$ if and only if $f$ is constant.

Every increasing function $f : [a,b] \to \mathbb{R}$ has bounded variation and

\[ V_a^b(f) = f(b) - f(a). \]

The set $BV([a,b])$ of all functions having bounded variation on $[a,b]$ is actually
a linear space. Moreover, $f$ has bounded variation if and only if its real and imaginary
parts have bounded variation.

We want to determine the structure of functions of bounded variation. The key
remark is that the restriction of any function $f \in BV([a,b])$ to a compact subinterval
$[c,d] \subset [a,b]$ also has bounded variation and

\[ V_c^d(f) = V_c^d(f|_{[c,d]}) \le V_a^b(f). \]
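
The definition can be explored numerically. The sketch below, with the hypothetical helper names `variation_sum` and `uniform_division`, computes the sum that appears under the supremum for one particular division; every such sum is only a lower bound for the variation, and the two test functions are chosen so that the bound is essentially attained.

```python
import math

def variation_sum(f, points):
    """Sum of |f(x_{k+1}) - f(x_k)| over consecutive points of a division."""
    return sum(abs(f(q) - f(p)) for p, q in zip(points, points[1:]))

def uniform_division(a, b, n):
    """Division of [a, b] into n equal parts (n + 1 points)."""
    return [a + (b - a) * k / n for k in range(n + 1)]

# sin rises by 1, falls by 2 and rises by 1 on [0, 2*pi]: variation 4.
v_sin = variation_sum(math.sin, uniform_division(0.0, 2 * math.pi, 1000))

# For an increasing function the sum telescopes to f(b) - f(a).
v_cube = variation_sum(lambda x: x ** 3, uniform_division(0.0, 2.0, 1000))

print(v_sin, v_cube)
```

The division of $[0, 2\pi]$ into 1000 parts contains (up to rounding) the points $\pi/2$ and $3\pi/2$ where the sine changes monotonicity, which is why a single division already captures the full variation here.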


The next result is obvious.

B.3.1 Lemma Let $f \in BV([a,b])$. Then:

(a) If $x < y < z$, then

\[ V_x^y(f) + V_y^z(f) = V_x^z(f). \]

(b) The function $V : x \to V_a^x(f)$ is increasing.


B.3.2 Jordan Decomposition Theorem A function $f : [a,b] \to \mathbb{C}$ has bounded
variation if and only if it can be represented as a linear combination

\[ f = f_1 - f_2 + i(f_3 - f_4) \]

of four increasing functions $f_1, f_2, f_3, f_4$ defined on $[a,b]$.

Proof It is enough to prove that every real-valued function with bounded variation
is the difference of two increasing functions. In this case we may choose $f_1(x) = V_a^x(f)$
and $f_2(x) = V_a^x(f) - f(x)$. It remains to prove that $f_2$ is increasing. Indeed, if $x < y$,
then

\[
\begin{aligned}
f_2(y) - f_2(x) &= (V_a^y(f) - f(y)) - (V_a^x(f) - f(x)) \\
&= V_x^y(f) - (f(y) - f(x)) \ge 0,
\end{aligned}
\]

by Lemma B.3.1 and the definition of variation.
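
The construction in the proof can be imitated on a finite grid. In the sketch below (the name `jordan_decomposition` is an illustrative assumption), the running sum of absolute increments plays the role of $x \to V_a^x(f)$; it is only a discrete stand-in for the true variation function, but it shows why both pieces come out increasing.

```python
import math

def jordan_decomposition(f, points):
    """Return sampled values (f1, f2) with f = f1 - f2 on the given points.

    f1 accumulates |f(x_{k+1}) - f(x_k)| along the division and
    f2 = f1 - f, as in the proof of the Jordan Decomposition Theorem.
    """
    values = [f(x) for x in points]
    f1 = [0.0]
    for prev, cur in zip(values, values[1:]):
        f1.append(f1[-1] + abs(cur - prev))
    f2 = [v1 - v for v1, v in zip(f1, values)]
    return f1, f2

def is_increasing(seq, tol=1e-12):
    """Check that a sequence is nondecreasing up to rounding errors."""
    return all(q - p >= -tol for p, q in zip(seq, seq[1:]))

points = [2 * math.pi * k / 400 for k in range(401)]
f1, f2 = jordan_decomposition(math.sin, points)
print(is_increasing(f1), is_increasing(f2))
```

Each increment of `f2` equals $|d| - d$ for an increment $d$ of $f$, which is the discrete counterpart of the inequality $V_x^y(f) \ge f(y) - f(x)$ used above.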
We are thus led to the following extension of Lebesgue’s Differentiation Theorem: 


B.3.3 Theorem Let $I$ be an interval and let $f : I \to \mathbb{C}$ be a function whose
restriction to any compact subinterval has bounded variation. Then $f$ is almost
everywhere differentiable and its derivative is Lebesgue integrable on each compact
subinterval.


Every Lipschitz function $f : [a,b] \to \mathbb{C}$ has bounded variation since

\[ \sum_{k=0}^{n-1} |f(x_{k+1}) - f(x_k)| \le \|f\|_{\mathrm{Lip}} \sum_{k=0}^{n-1} |x_{k+1} - x_k| = \|f\|_{\mathrm{Lip}} (b - a) \]

for every division of $[a,b]$. In practice we are interested in a more general class
of functions, that of locally Lipschitz functions (that is, Lipschitz on every compact
interval). For example, the convex functions defined on open intervals are locally
Lipschitz. In fact, if $f : I \to \mathbb{R}$ is such a function and if $[a,b] \subset I$, then, according
to the results presented in Sect. 8.9,

\[ \frac{f(x) - f(a)}{x - a} \le \frac{f(y) - f(x)}{y - x} \le \frac{f(b) - f(y)}{b - y}, \]

whenever $a < x < y < b$. Thus

\[ \left| \frac{f(y) - f(x)}{y - x} \right| \le \max\{ |f'_+(a)|, |f'_-(b)| \}. \]


According to Theorem B.3.3, the following general result holds: 


B.3.4 Corollary (H. Rademacher) Every locally Lipschitz function is almost every- 
where differentiable. 


Let us note that Clarke [9] has developed an analogue of the subdifferential for 
all locally Lipschitz functions defined on open intervals. 


Exercises 


1. Consider the function

\[ f(x) = \begin{cases} x^{\alpha} \sin(x^{-\beta}), & \text{if } x \in (0,1] \\ 0, & \text{if } x = 0, \end{cases} \]

where $\alpha$ and $\beta$ are positive constants. Prove that $f$ has bounded variation if and
only if $\alpha > \beta$. Infer from this example that there are continuous functions on
$[0,1]$ which do not have bounded variation.

2. Prove that every function $f \in C^1([a,b])$ has bounded variation and

\[ V_a^b(f) = \int_a^b |f'(t)|\,dt. \]

[Hint: The inequality $V_a^b(f) \le \int_a^b |f'(t)|\,dt$ is immediate taking into account
that $|f(v) - f(u)| \le \int_u^v |f'(t)|\,dt$. For the other inequality, fix arbitrarily a point
$x_0 \in [a,b)$ and consider the function $s(x) = V_a^x(f)$. Noticing that, for $x > x_0$,
$|f(x) - f(x_0)| \le s(x) - s(x_0) \le \int_{x_0}^x |f'(t)|\,dt$, one can show that $s'_+(x_0) = |f'(x_0)|$.
If $x_0 > a$, a similar remark works for $s'_-(x_0)$, so at the end we obtain that $s$ is
differentiable and $s'(x) = |f'(x)|$ for all $x \in [a,b]$.]

3. Extend the result of the previous exercise for piecewise $C^1$ functions.

Note In geometry, a planar rectifiable curve is a curve $C$ defined by a parametrization
$\varphi : [a,b] \to \mathbb{R}^2$, $\varphi(t) = (x(t), y(t))$, having bounded variation. By
definition, the length $\ell(C)$ of $C$ is the variation $V_a^b(\varphi)$. Thus, if $\varphi$ is piecewise $C^1$,

\[ \ell(C) = \int_a^b \sqrt{(x'(t))^2 + (y'(t))^2}\,dt. \]

For example, the circle of radius $R$ centered at the origin has the parametrization

\[ \varphi(t) = (R\cos t, R\sin t), \quad t \in [0, 2\pi], \]

and its length is $2\pi R$. However, in most cases the computation of the length of a
curve in closed form is not possible. The case of the ellipse

\[ \varphi(t) = (a\cos t, b\sin t), \quad t \in [0, 2\pi], \]

imposed the consideration of a special function, the complete elliptic integral of
the second kind,

\[ E(k) = \int_0^{\pi/2} \sqrt{1 - k^2 \sin^2 t}\,dt, \quad k \in [-1, 1]. \]

Its values are computed by numerical methods. See Sect. 9.2, Exercise 13.
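
As an illustration of the Note, the following sketch compares the length of a polygon inscribed in an ellipse with the value $4aE(k)$, where $k = \sqrt{1 - b^2/a^2}$ is the eccentricity. This relation between the perimeter and $E$ is a standard fact that is not stated in the text, and the helper names `polygonal_length` and `elliptic_e`, as well as the use of Simpson's rule, are assumptions made for the example.

```python
import math

def polygonal_length(curve, a, b, n):
    """Length of the polygon inscribed in `curve` at n + 1 equally spaced parameters."""
    pts = [curve(a + (b - a) * k / n) for k in range(n + 1)]
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))

def elliptic_e(k, n=2000):
    """Composite Simpson approximation of E(k) on [0, pi/2]; n must be even."""
    h = (math.pi / 2) / n

    def integrand(t):
        return math.sqrt(1.0 - (k * math.sin(t)) ** 2)

    total = integrand(0.0) + integrand(math.pi / 2)
    total += sum((4 if j % 2 else 2) * integrand(j * h) for j in range(1, n))
    return total * h / 3

A, B = 2.0, 1.0                           # semi-axes of the ellipse, A >= B

def ellipse(t):
    return (A * math.cos(t), B * math.sin(t))

eccentricity = math.sqrt(1.0 - (B / A) ** 2)
length_polygon = polygonal_length(ellipse, 0.0, 2 * math.pi, 20000)
length_formula = 4 * A * elliptic_e(eccentricity)
print(length_polygon, length_formula)
```

The inscribed polygon is exactly a sum of the kind used to define the variation of the parametrization, so it approaches the length from below as the division is refined.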

4. Suppose that f is continuous and of bounded variation on [0, 1]. Prove that there 
exists a homeomorphism h of [0, 1] onto itself such that f o h is a Lipschitz 
function. 

5. (Yorke’s Inequality). Suppose that $f : [0,1] \to \mathbb{R}$ is a function of bounded
variation on $[a,b] \subset [0,1]$. Prove that

\[ V_0^1(f\chi_{[a,b]}) \le 2V_a^b(f) + \frac{2}{b-a} \int_a^b |f(x)|\,dx. \]


B.4 Absolutely Continuous Functions 


The aim of this section is to discuss the problem of antiderivatives within the theory 
of Lebesgue integral. The basic notion is that of absolute continuity. 


B.4.1 Definition (Giuseppe Vitali) Let $I$ be a nondegenerate interval. A complex-valued
function $F$ is said to be absolutely continuous (that is, $F \in AC(I)$) if for every
$\varepsilon > 0$, there is $\delta > 0$ such that for every finite family $((a_k, b_k))_{k=1}^{n}$ of disjoint open
subintervals such that

\[ \sum_{k=1}^{n} (b_k - a_k) < \delta, \]

we have

\[ \sum_{k=1}^{n} |F(b_k) - F(a_k)| < \varepsilon. \]

It is obvious that an absolutely continuous function is uniformly continuous and
has bounded variation on every compact subinterval of $I$ (in particular, it is almost
everywhere differentiable). We have the following structure theorem.


B.4.2 Lemma Let $H \in AC([a,b])$ be such that $H' = 0$ almost everywhere. Then $H$
is a constant function.

Proof Clearly, it suffices to consider only the case of real-valued functions. We will
show that $H(c) = H(a)$ for all $c \in (a,b]$.

Let $\varepsilon > 0$ and choose $\delta > 0$ as in the definition of absolute continuity. The set
$E = \{x \in (a,c) : H'(x) = 0\}$ has measure $\lambda(E) = c - a$. For each $x \in E$, there are
arbitrarily small numbers $h > 0$ such that $[x, x+h] \subset (a,c)$ and

\[ |H(x+h) - H(x)| < \frac{\varepsilon h}{c - a}. \]

The family of all these intervals is a Vitali covering of $E$ and thus, there is a
sequence $\{[x_k, x_k + h_k]\}_{k=1}^{m}$ of mutually disjoint intervals such that

\[ \lambda\Big(E \setminus \bigcup_{k=1}^{m} [x_k, x_k + h_k]\Big) < \delta. \]

Therefore $\lambda((a,c)) = \lambda(E) < \delta + \sum_{k=1}^{m} h_k$. Without loss of generality, we can
assume that $x_1 < x_2 < \dots < x_m$. The sum of the lengths of the intervals

\[ (a, x_1), \ (x_1 + h_1, x_2), \ \dots, \ (x_m + h_m, c) \]

(that is, the measure of the set $(a,c) \setminus \bigcup_{k=1}^{m} [x_k, x_k + h_k]$) is less than $\delta$,
which yields

\[ |H(a) - H(x_1)| + \sum_{k=1}^{m-1} |H(x_k + h_k) - H(x_{k+1})| + |H(x_m + h_m) - H(c)| < \varepsilon. \]

Then

\[
\begin{aligned}
|H(a) - H(c)| &\le |H(a) - H(x_1)| + \sum_{k=1}^{m-1} |H(x_k + h_k) - H(x_{k+1})| \\
&\quad + |H(x_m + h_m) - H(c)| + \sum_{k=1}^{m} |H(x_k + h_k) - H(x_k)| \\
&< \varepsilon + \frac{\varepsilon}{c - a} \sum_{k=1}^{m} h_k \le 2\varepsilon,
\end{aligned}
\]

and since $\varepsilon > 0$ was arbitrarily fixed, it follows that $H(c) = H(a)$.


There exist real-valued functions $F$ on $[0,1]$ such that $F(0) = 0$, $F(1) = 1$, $F$ is
continuous and strictly increasing and $F' = 0$ almost everywhere. See Chap. 9, the
section of Notes and Remarks. According to Lemma B.4.2, such a function is not
absolutely continuous.
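
The same failure of absolute continuity can be observed on the Cantor-Lebesgue function of Sect. B.2, Exercise 4 (which is increasing, though not strictly). The sketch below is an illustration under stated assumptions: the helper `cantor` evaluates the function through its self-similarity, and `stage_intervals` lists the $2^n$ closed intervals left after $n$ steps of the construction of the Cantor set; both names are hypothetical. The total length of these intervals tends to 0, while the total increase of the function over them stays equal to 1.

```python
from fractions import Fraction

def cantor(x, depth=60):
    """Cantor-Lebesgue function via its self-similarity (illustrative helper)."""
    if depth == 0 or x <= 0:
        return Fraction(0)
    if x >= 1:
        return Fraction(1)
    if x <= Fraction(1, 3):
        return cantor(3 * x, depth - 1) / 2
    if x >= Fraction(2, 3):
        return Fraction(1, 2) + cantor(3 * x - 2, depth - 1) / 2
    return Fraction(1, 2)         # middle third: the function is constant

def stage_intervals(n):
    """The 2**n closed intervals left after n steps of the Cantor construction."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        refined = []
        for a, b in intervals:
            third = (b - a) / 3
            refined += [(a, a + third), (b - third, b)]
        intervals = refined
    return intervals

for n in (2, 5, 8):
    ivs = stage_intervals(n)
    total_length = sum(b - a for a, b in ivs)
    total_increase = sum(cantor(b) - cantor(a) for a, b in ivs)
    print(n, float(total_length), total_increase)
```

Since the total length $(2/3)^n$ can be made smaller than any $\delta$, no $\delta$ can work for $\varepsilon = 1$ in Definition B.4.1.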


B.4.3 Fundamental Theorem of the Integral Calculus (The Lebesgue version)
(a) If $f \in L^1([a,b])$, then the function

\[ F(x) = \int_a^x f(t)\,dt, \quad x \in [a,b], \]

is absolutely continuous and $F' = f$ almost everywhere.

(b) Let $F : [a,b] \to \mathbb{C}$ be an absolutely continuous function. Then $F$ is almost
everywhere differentiable and the function $F'$ (extended arbitrarily at the points of
nondifferentiability) is Lebesgue integrable. Moreover,

\[ F(x) = F(a) + \int_a^x F'(t)\,dt \quad \text{for all } x \in [a,b]. \]



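Proof (a) According to Theorem 11.6.3, for every $\varepsilon > 0$, there is $\delta > 0$ such that,
for every Lebesgue measurable set $A \subset \mathbb{R}$,

\[ \lambda(A) < \delta \quad \text{implies} \quad \int_A |f|\,dx < \varepsilon. \]

Then for every finite family $((a_k, b_k))_{k=1}^{n}$ of disjoint open subintervals of $[a,b]$ such
that

\[ \sum_{k=1}^{n} (b_k - a_k) < \delta, \]

we have

\[
\begin{aligned}
\sum_{k=1}^{n} |F(b_k) - F(a_k)| &= \sum_{k=1}^{n} \left| \int_{a_k}^{b_k} f(t)\,dt \right| \le \sum_{k=1}^{n} \int_{a_k}^{b_k} |f(t)|\,dt \\
&= \int_{\bigcup_{k=1}^{n} (a_k, b_k)} |f|\,dx < \varepsilon.
\end{aligned}
\]

This proves the absolute continuity of $F$.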

The formula F’ = f almost everywhere follows from Fubini’s Differentiation 
Theorem. 

It suffices to consider the case where $f$ is positive. Since $f$ is integrable, it is
the almost everywhere limit of an increasing sequence of step functions $g_n$. See
Remark 11.5.5. Put

\[ \Phi_n(x) = \int_a^x g_n(t)\,dt. \]

Then $\Phi'_n(x) = g_n(x)$ except at the points of discontinuity of $g_n$. By Beppo Levi’s
Theorem, $\Phi_n(x) \to F(x)$. Since

\[ F(x) = \Phi_0(x) + \sum_{n=0}^{\infty} (\Phi_{n+1}(x) - \Phi_n(x)), \]

from Fubini’s Differentiation Theorem we conclude that

\[ F'(x) = \Phi'_0(x) + \sum_{n=0}^{\infty} (\Phi'_{n+1}(x) - \Phi'_n(x)) = f(x) \]

almost everywhere.

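(b) Notice that $F$ has bounded variation, so we can apply Theorem B.3.3 to it.
As concerns the formula relating $F$ and $F'$, this follows from the assertion (a) and
Lemma B.4.2, applied to the absolutely continuous function
$H(x) = F(x) - F(a) - \int_a^x F'(t)\,dt$, whose derivative vanishes almost everywhere.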


Exercises 


1. Prove that the function $\sqrt{x}$ is absolutely continuous on $[0,1]$.
2. Prove that the formula for variation in Exercise 2, Sect. B.3, works also for the
absolutely continuous functions.
3. Prove that the Cantor-Lebesgue singular function is not absolutely continuous.
4. Let $F : [a,b] \to \mathbb{C}$ be an absolutely continuous function. Prove that for every
$\varepsilon > 0$, there is $\delta > 0$ such that

\[ \sum_{k=1}^{m} \omega_F(I_k) < \varepsilon \]

for every family $(I_k)_{k=1}^{m}$ of pairwise disjoint open subintervals of $[a,b]$ for which
$\sum_{k=1}^{m} \ell(I_k) < \delta$.
5. Infer from the preceding exercise that every absolutely continuous function
$F : [a,b] \to \mathbb{C}$ is an N-function, that is, a function such that

\[ A \subset [a,b] \text{ and } \lambda(A) = 0 \quad \text{implies} \quad \lambda(F(A)) = 0. \]

Note A remarkable result due to Stefan Banach asserts that the converse holds in
the class of continuous functions of bounded variation. See [10], pp. 288-290.


6. (Integration by parts of absolutely continuous functions). Let $F$ and $G$ be two
functions in $AC([a,b])$. Prove that

\[ \int_a^b F'(t)G(t)\,dt = FG\big|_a^b - \int_a^b F(t)G'(t)\,dt. \]

Note As concerns the change of variable formula, the following result can be
found in [10], pp. 343-344: Let $\varphi$ be a monotone continuous N-function with
domain $[a,b]$ and range $[\alpha, \beta]$ ($\alpha < \beta$). Then $\varphi$ is absolutely continuous and for
every $f \in L^1([\alpha, \beta])$, we have $(f \circ \varphi)|\varphi'| \in L^1([a,b])$ and

\[ \int_{\alpha}^{\beta} f(t)\,dt = \int_a^b f(\varphi(x)) |\varphi'(x)|\,dx. \]


7. (Absolute continuity of convex functions). Let $A$ be an open interval and $a \in A$.
Prove that a function $F : A \to \mathbb{R}$ is convex if and only if there is an increasing
function $f : A \to \mathbb{R}$ such that

\[ F(x) = F(a) + \int_a^x f(t)\,dt \quad \text{for all } x \in A. \]

8. Let $f, g \in L^1([a,b])$ and $\alpha, \beta \in \mathbb{C}$. Consider the functions

\[ F(x) = \alpha + \int_a^x f(t)\,dt \quad \text{and} \quad G(x) = \beta + \int_a^x g(t)\,dt. \]

Prove that:
(a) The function $FG$ is absolutely continuous.
(b) $(FG)' = F'G + FG'$.
(c) $\int_a^b f(t)G(t)\,dt = FG\big|_a^b - \int_a^b F(t)g(t)\,dt$.

[Hint: For (a), notice that $|F(u)G(u) - F(v)G(v)|$ does not exceed

\[ \sup |F(x)| \cdot |G(u) - G(v)| + \sup |G(x)| \cdot |F(u) - F(v)|. \]

For (c), see Theorem B.4.3(b).]


B.5 The Sobolev Space $W^{1,1}(I)$


In the applications of mathematical analysis to differential equations, there are several
spaces of differentiable functions that show up in a natural way. For example, to every
open bounded interval $I$ and every number $k \in \mathbb{N}^* \cup \{\infty\}$, we can attach the space
$C^k(I)$ (of all $C^k$ functions $f : I \to \mathbb{C}$), and its subspace $C_c^k(I)$ (consisting of all
functions with compact support). Several spaces of almost everywhere differentiable
functions also arise naturally, like $\mathrm{Lip}(I)$, the space of Lipschitz functions
$f : I \to \mathbb{C}$, and $AC(I)$, the space of absolutely continuous functions.

In the 1930s, S. Sobolev showed how to construct new spaces of almost everywhere
differentiable functions based on the integration by parts formula.

The Sobolev space $W^{1,1}(I)$ consists of all functions $f \in L^1(I)$ for which there is
$g \in L^1(I)$ such that

\[ \int_I f(x)\varphi'(x)\,dx = -\int_I g(x)\varphi(x)\,dx \quad \text{for all } \varphi \in C_c^{\infty}(I). \tag{B.2} \]

By the integration by parts formula, $AC(I) \subset W^{1,1}(I)$, the formula above being
true for $g = f'$.
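
Formula (B.2) can be tested numerically on a simple example. The sketch below takes $I = (-1, 1)$, $f(x) = |x|$ and $g = \operatorname{sgn}$, and checks the identity for one particular test function; the bump function, its support $[-0.3, 0.7]$ and the midpoint quadrature rule are assumptions chosen for the illustration, and a single test function of course proves nothing about all of them.

```python
import math

CENTER, RADIUS = 0.2, 0.5       # support [-0.3, 0.7] lies inside I = (-1, 1)

def bump(x):
    """Smooth test function with compact support in (-1, 1)."""
    d = 1.0 - ((x - CENTER) / RADIUS) ** 2
    return math.exp(-1.0 / d) if d > 0.0 else 0.0

def bump_prime(x):
    """Derivative of `bump`, obtained from the chain rule."""
    s = (x - CENTER) / RADIUS
    d = 1.0 - s * s
    value = bump(x)
    if value == 0.0:            # outside the support, or underflow near its edge
        return 0.0
    return value * (-2.0 * s / (d * d)) / RADIUS

def midpoint_integral(h, a, b, n=20000):
    """Composite midpoint rule for the integral of h over [a, b]."""
    step = (b - a) / n
    return step * sum(h(a + (j + 0.5) * step) for j in range(n))

def sign(x):
    return (x > 0) - (x < 0)

lhs = midpoint_integral(lambda x: abs(x) * bump_prime(x), -1.0, 1.0)
rhs = -midpoint_integral(lambda x: sign(x) * bump(x), -1.0, 1.0)
print(lhs, rhs)
```

The test function is deliberately centered away from the origin: for a bump symmetric about 0 both sides would vanish and the check would be uninformative.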

The function g that appears in the formula (B.2) is almost everywhere uniquely 
determined by f and is called the generalized derivative. It will be denoted Df. The 
almost everywhere uniqueness of g follows from the next result. 


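B.5.1 Lemma Let $h \in L^1(I)$ be such that

\[ \int_I h(x)\varphi(x)\,dx = 0 \]

for all functions $\varphi \in C_c^{\infty}(I)$. Then $h = 0$ almost everywhere.

Proof It suffices to consider only real functions. Using the Weierstrass Approximation
Theorem, the equality in the hypothesis holds for every $\varphi \in C_c(I, \mathbb{R})$.

Let $\varepsilon > 0$. By Theorem 11.5.8, there is $h_\varepsilon \in C_c(I, \mathbb{R})$ such that $\|h - h_\varepsilon\|_1 < \varepsilon$.
Then $\left| \int_I h_\varepsilon \varphi\,dx \right| \le \varepsilon \|\varphi\|_{\infty}$ for every $\varphi \in C_c(I, \mathbb{R})$. Let

\[ K_1 = \{x \in I : h_\varepsilon(x) \le -\varepsilon\} \quad \text{and} \quad K_2 = \{x \in I : h_\varepsilon(x) \ge \varepsilon\}. \]

By Urysohn’s Lemma (see Exercise 7, Sect. 6.1), there is a continuous function
$\psi : I \to [-1, 1]$, with compact support, such that $\psi = -1$ on $K_1$ and $\psi = 1$ on $K_2$.
Then

\[
\begin{aligned}
\int_I |h_\varepsilon|\,dx &\le \left| \int_{K_1 \cup K_2} h_\varepsilon \psi\,dx \right| + \int_{I \setminus (K_1 \cup K_2)} |h_\varepsilon|\,dx \\
&\le \varepsilon + \int_{I \setminus (K_1 \cup K_2)} |h_\varepsilon|\,dx.
\end{aligned}
\]

Therefore

\[ \int_I |h(x)|\,dx \le \int_I |h(x) - h_\varepsilon(x)|\,dx + \int_I |h_\varepsilon(x)|\,dx \le (2 + \lambda(I))\varepsilon, \]

which yields that $h = 0$ almost everywhere (since $\varepsilon > 0$ was arbitrarily fixed). See
Corollary 11.3.12.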


B.5.2 Corollary Let $h \in L^1(I)$ be a function such that

\[ \int_I h(x)\varphi'(x)\,dx = 0 \]

for all functions $\varphi \in C_c^{\infty}(I)$. Then $h$ is constant almost everywhere.

Proof Choose a function $w \in C_c^{\infty}(I)$ such that $\int_I w(x)\,dx = 1$; see Exercise 2,
Sect. 8.5. Then for every $\psi$ in $C_c^{\infty}(I)$, the function $g = \psi - \left( \int_I \psi(y)\,dy \right) w$ belongs
to $C_c^{\infty}(I)$ and $\int_I g(x)\,dx = 0$. Therefore, there is a unique $\Phi \in C_c^{\infty}(I)$ such that
$\Phi' = g$. According to the hypothesis,

\[
\begin{aligned}
0 = \int_I h\Phi'\,dx &= \int_I \left[ \psi(x) - \left( \int_I \psi(y)\,dy \right) w(x) \right] h(x)\,dx \\
&= \int_I \left[ h(x) - \int_I h(y)w(y)\,dy \right] \psi(x)\,dx
\end{aligned}
\]

and since $\psi$ is arbitrary, we conclude from Lemma B.5.1 that $h = \int_I h(y)w(y)\,dy$
almost everywhere.

B.5.3 Theorem $W^{1,1}(I)$ consists of all functions $f : I \to \mathbb{C}$ for which there exists
$h \in AC(I)$ such that $f = h$ almost everywhere.

Proof Let $f \in W^{1,1}(I)$ and let $g$ be the generalized derivative of $f$. For $c \in I$
arbitrarily fixed, consider the function

\[ G(x) = \int_c^x g(t)\,dt, \quad x \in I. \]

This function is absolutely continuous on $I$ and for every $\varphi \in C_c^{\infty}(I)$, we have

\[ \int_I [f(x) - G(x)]\varphi'(x)\,dx = -\int_I g(x)\varphi(x)\,dx + \int_I G'(x)\varphi(x)\,dx = 0. \]

By Corollary B.5.2, there is a constant $C$ such that $f - G = C$ almost everywhere.
Thus, $h = G + C$ is absolutely continuous on $I$ and $f = h$ almost everywhere.
The other implication follows from Theorem B.4.3.


Now we can see that $f \in \mathrm{Lip}^1(I)$ is equivalent to $f \in AC(I)$ and $f' \in L^{\infty}(I)$.
This fact, combined with the last theorem, gives us the possibility of defining a
whole string of intermediate spaces in between $\mathrm{Lip}^1(I)$ and $AC(I)$, indexed by
$p \in (1, \infty)$, defined by the condition $|f'|^p \in L^1(I)$.


Exercises 


1. (a) Prove that the signum function is the generalized derivative of the absolute
value function. Then, infer that the restriction of the absolute value function
to $(-1, 1)$ belongs to $W^{1,1}((-1, 1))$.
(b) Prove that the restriction of the signum function to $(-1, 1)$ does not belong
to $W^{1,1}((-1, 1))$.
2. Suppose that $I$ and $J$ are bounded open intervals and $J \subset I$. Prove that $f \in W^{1,1}(I)$
implies $f|_J \in W^{1,1}(J)$.
3. Let $I_1$ and $I_2$ be two bounded open intervals such that $I_1 \cap I_2 \ne \emptyset$. Suppose that
$f_1 \in W^{1,1}(I_1)$ and $f_2 \in W^{1,1}(I_2)$ are two functions such that $f_1|_{I_1 \cap I_2} = f_2|_{I_1 \cap I_2}$.
Prove that the function

\[ f(x) = \begin{cases} f_1(x) & \text{if } x \in I_1 \\ f_2(x) & \text{if } x \in I_2 \end{cases} \]

belongs to $W^{1,1}(I_1 \cup I_2)$.

4. Suppose that $f \in W^{1,1}(I)$ and its generalized derivative is 0. Prove that $f$ is
constant almost everywhere.

5. The natural norm of the Sobolev space $W^{1,1}(I)$ is

\[ \|f\|_{W^{1,1}} = \int_I (|f| + |Df|)\,dx. \]

Verify that this is indeed a norm. Then, prove that $W^{1,1}(I)$ is complete.


B.6 Notes and Remarks 


The set of nondifferentiability of a continuous function has been the object of a
considerable amount of research. See the survey paper by Bruckner and Leonard [11]. We
shall recall here a striking result due to Zygmunt Zahorski: If $E \subset \mathbb{R}$ is the union
of a $G_\delta$-set and a null set that is the union of $G_\delta$-sets, then there is a continuous
function $f : \mathbb{R} \to \mathbb{R}$ whose derivative exists nowhere on $E$ but everywhere outside
$E$. This is a consequence of the following two facts. For each $G_\delta$-set $E_0 \subset \mathbb{R}$, there
is a uniformly continuous function $f : \mathbb{R} \to \mathbb{R}$, differentiable everywhere outside
of $E_0$ and whose set of Dini derivatives $D^+f(x)$, $D_+f(x)$, $D^-f(x)$, $D_-f(x)$ includes
the values $-\infty$ and $\infty$ at each point of $E_0$. If $E_1 \subset \mathbb{R}$ is a null set of type $G_\delta$, then
there is a Lipschitz function $f$ such that $f'(x)$ exists precisely for $x \notin E_1$.

For $\alpha \in (0, 1]$, denote by $\mathrm{Lip}^{\alpha}([0,1])$ the space of all functions $f : [0,1] \to \mathbb{R}$
which are Lipschitz of order $\alpha$. Then

\[ C^1([0,1]) \subset \mathrm{Lip}^1([0,1]) \subset \bigcap_{0<\alpha<1} \mathrm{Lip}^{\alpha}([0,1]) \subset \mathrm{Lip}^{\alpha}([0,1]) \subset C([0,1]) \]

for all $\alpha \in (0, 1)$ and all inclusions are strict. By Corollary B.3.4, every Lipschitz
function is almost everywhere differentiable. It was noticed by Józef Marcinkiewicz
[12] that if $f \in \mathrm{Lip}^1([0,1])$, then for every $\varepsilon > 0$ there is $g \in C^1([0,1])$ such that
$\lambda(\{x : f(x) \ne g(x)\}) < \varepsilon$. The requirement that $f$ belongs to $\mathrm{Lip}^1([0,1])$ cannot be
weakened to $f \in \bigcap_{0<\alpha<1} \mathrm{Lip}^{\alpha}([0,1])$. An example is offered by the Takagi-van der
Waerden function,

\[ v : \mathbb{R} \to \mathbb{R}, \quad v(t) = \sum_{n=0}^{\infty} \frac{\varphi(10^n t)}{10^n}, \]

where $\varphi(t)$ denotes the distance from $t$ to the nearest integer. Indeed, $v$ belongs to
$\bigcap_{0<\alpha<1} \mathrm{Lip}^{\alpha}([0,1])$, but for each subset $X \subset [0,1]$ of positive measure, one can
prove that the set $\left\{ \frac{v(x) - v(y)}{x - y} : x, y \in X, \ x \ne y \right\}$ is unbounded. See [13].

A necessary and sufficient condition that a set $E \subset [a,b]$ be the set of discontinuity
of a derivative is that $E$ be an $F_\sigma$ set of the first category. See the aforementioned
paper of Bruckner and Leonard [11]. In particular, there are derivatives which are
discontinuous almost everywhere on a dense set. Alfred Köpcke (1887) has shown
that there exist differentiable functions $f : [a,b] \to \mathbb{R}$ such that $f'$ is bounded and
both sets $\{x : f'(x) > 0\}$ and $\{x : f'(x) < 0\}$ are dense in $[a,b]$.

As N. Luzin noticed in 1915, every measurable function $f : [a,b] \to \mathbb{R}$ is almost
everywhere the derivative of a suitable continuous function.

The concept of function of bounded variation was introduced by Jordan [14] in 
1881, who was seeking a sufficient condition for a function f to have a Fourier series 
that sums to f. See Theorem 12.4.1. Several important generalizations of the notion 
of bounded variation can be found in the paper of Pierce and Velleman [15]. 

The first to consider the notion of generalized derivative was Henri Poincaré,
on page 100 of his paper Sur les équations de la Physique mathématique, Rendiconti
del Circolo Matematico di Palermo, 8 (1894), 57-155. Sobolev spaces were
introduced by Sergei L. Sobolev in the paper Méthode nouvelle à résoudre le problème
de Cauchy pour les équations linéaires hyperboliques normales, Rec. Mat.
(Matematicheskii Sbornik) 1 (1936), 39-72. A thorough presentation of the theory
of Sobolev spaces and its applications to partial differential equations can be found
in the recent book of Willem [16].

The variation of a function $f \in W^{1,1}(I)$ is defined as $V(f) = \int_I |Df|\,dx$. If $f$ is
real-valued, $V(f)$ equals

\[ V(f) = \sup\left\{ \int_I f g'\,dx : g \in C_c^1(I), \ |g| \le 1 \right\}. \]

This remark led Ennio De Giorgi to extend the concept of function with bounded
variation to the context of integrable functions. Precisely, a function $f \in L^1(I)$ has
bounded variation if $V(f) < \infty$. The importance of this extension is discussed
by Enrico Giusti in his book, Minimal Surfaces and Functions of Bounded Variation,
Birkhäuser Verlag 1984.


Appendix C 
The Riemann-Stieltjes Integral 


The Riemann-Stieltjes integral is a generalization of the Riemann integral with many
applications in mathematics, mechanics, probability theory, etc. It formally results
by replacing the Riemann integral sums by Riemann-Stieltjes integral sums, where
the role of the length of partial intervals, $\ell([x_k, x_{k+1}]) = x_{k+1} - x_k$, is taken by a
measure of the form $m([x_k, x_{k+1}]) = g(x_{k+1}) - g(x_k)$.


C.1 Integral Sums and Integrability 


In what follows, we describe a generalization of the Riemann integral which allows one
(under certain circumstances) to integrate a function $f$ with respect to another function
$g$. Both functions are supposed to be real-valued and defined on the same nondegenerate
compact interval $[a,b]$.

As in the case of the Riemann integral, we start by associating to each division $\Delta$
consisting of the points

\[ a = x_0 < x_1 < \dots < x_n = b, \]

and to each intermediate $\Delta$-point system $\xi = (\xi_k)_{k=0}^{n-1}$, an integral sum

\[ S_{(\Delta,\xi)}(f; g) = \sum_{k=0}^{n-1} f(\xi_k)(g(x_{k+1}) - g(x_k)). \]

C.1.1 Definition A function $f$ is called Riemann-Stieltjes integrable with respect
to the function $g$ if there is a number $I$ with the following property: for every $\varepsilon > 0$,
there is $\delta > 0$ such that for every division $\Delta$ with $\|\Delta\| < \delta$ and every choice of an
intermediate $\Delta$-point system $\xi$, we have

\[ |I - S_{(\Delta,\xi)}(f; g)| < \varepsilon. \]

When such a number $I$ exists, it is uniquely determined by the pair $f$ and $g$. It is
called the Riemann-Stieltjes integral of $f$ with respect to $g$ (over the interval $[a,b]$)
and denoted by one of the symbols

\[ \int_a^b f(x)\,dg(x) \quad \text{and} \quad \int_a^b f\,dg. \]

As in the case of the Riemann integral (which represents the particular case where
$g(x) = x$), we define

\[ \int_b^a f(x)\,dg(x) = -\int_a^b f(x)\,dg(x) \]

and

\[ \int_a^a f(x)\,dg(x) = 0. \]

The Riemann-Stieltjes integral reduces to the Riemann integral not only in the 
aforementioned case, but also under much more general circumstances. 


C.1.2 Proposition If $f \in C([a,b], \mathbb{R})$ and $g \in C^1([a,b], \mathbb{R})$, then $f$ is Riemann-Stieltjes
integrable with respect to $g$ and

\[ \int_a^b f\,dg = \int_a^b f g'\,dx. \]

Proof Indeed, by Lagrange’s Mean Value Theorem,

\[ S_{(\Delta,\xi)}(f; g) = \sum_{k=0}^{n-1} f(\xi_k)(g(x_{k+1}) - g(x_k)) = \sum_{k=0}^{n-1} f(\xi_k) g'(\eta_k)(x_{k+1} - x_k), \]

where $\eta_k \in [x_k, x_{k+1}]$ for $k = 0, 1, \dots, n-1$. Since $g'$ is uniformly continuous, for
$\varepsilon > 0$ arbitrarily fixed, there exists $\delta > 0$ such that

\[ |x - y| < \delta \quad \text{implies} \quad |g'(x) - g'(y)| < \frac{\varepsilon}{\|f\|_{\infty}(b - a)}. \]

Therefore, if $\|\Delta\| < \delta$, then

\[ \left| S_{(\Delta,\xi)}(f; g) - S_{(\Delta,\xi)}(f g'; \mathrm{id}_{[a,b]}) \right| \le \sum_{k=0}^{n-1} |f(\xi_k)| \, |g'(\xi_k) - g'(\eta_k)| (x_{k+1} - x_k) < \varepsilon. \tag{C.1} \]

The existence of the Riemann integral $\int_a^b f g'\,dx$ assures that

\[ S_{(\Delta,\xi)}(f g'; \mathrm{id}_{[a,b]}) \to \int_a^b f g'\,dx \quad \text{as } \|\Delta\| \to 0, \]

so by (C.1) we conclude that $f$ is Riemann-Stieltjes integrable with respect to $g$ and
$\int_a^b f\,dg = \int_a^b f g'\,dx$.


Proposition C.1.2 also works when g is an absolutely continuous function, but in 
this case the right hand side integral is a Lebesgue integral. 
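
Proposition C.1.2 lends itself to a quick numerical check. The sketch below builds Riemann-Stieltjes sums directly from the definition and compares them with ordinary Riemann sums of $f g'$; the choices $f = \cos$, $g(x) = x^2$ on $[0,1]$, the midpoint tags and the helper names are assumptions made for the example.

```python
import math

def rs_sum(f, g, points, tags):
    """Riemann-Stieltjes sum of f with respect to g for a tagged division."""
    return sum(f(t) * (g(q) - g(p))
               for p, q, t in zip(points, points[1:], tags))

def uniform_tagged_division(a, b, n):
    """Uniform division of [a, b] with midpoints as intermediate points."""
    points = [a + (b - a) * k / n for k in range(n + 1)]
    tags = [(p + q) / 2 for p, q in zip(points, points[1:])]
    return points, tags

def f(x):
    return math.cos(x)

def g(x):
    return x * x                # g is C^1 with g'(x) = 2x

points, tags = uniform_tagged_division(0.0, 1.0, 2000)

stieltjes = rs_sum(f, g, points, tags)
riemann = rs_sum(lambda x: f(x) * 2 * x, lambda x: x, points, tags)
exact = 2 * (math.sin(1.0) + math.cos(1.0) - 1.0)   # antiderivative x sin x + cos x
print(stieltjes, riemann, exact)
```

Passing the identity function as the integrator turns `rs_sum` into an ordinary Riemann sum, which mirrors the remark that the Riemann integral is the particular case $g(x) = x$.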

The role of f and g in Proposition C.1.2 can be reversed. See Theorem C.1.7. 

The next four results collect a number of easy (but important) properties of the 
Riemann-Stieltjes integral. As above, all functions involved are defined on the same 
interval [a, b]. 


C.1.3 Lemma (Linearity of Riemann-Stieltjes integral) Suppose that $\alpha_1$ and $\alpha_2$ are
two real numbers.

(a) If $f_1$ and $f_2$ are Riemann-Stieltjes integrable functions with respect to $g$, then
the function $\alpha_1 f_1 + \alpha_2 f_2$ is also Riemann-Stieltjes integrable with respect to $g$ and

\[ \int_a^b (\alpha_1 f_1 + \alpha_2 f_2)\,dg = \alpha_1 \int_a^b f_1\,dg + \alpha_2 \int_a^b f_2\,dg. \]

(b) If $f$ is a Riemann-Stieltjes integrable function with respect to each of the
functions $g_1$ and $g_2$, then the function $f$ is also Riemann-Stieltjes integrable with
respect to $\alpha_1 g_1 + \alpha_2 g_2$ and

\[ \int_a^b f\,d(\alpha_1 g_1 + \alpha_2 g_2) = \alpha_1 \int_a^b f\,dg_1 + \alpha_2 \int_a^b f\,dg_2. \]


C.1.4 Lemma (Positivity of Riemann-Stieltjes integral) Suppose that $f : [a,b] \to \mathbb{R}$
is a positive function and $g : [a,b] \to \mathbb{R}$ is an increasing function. If $f$ is
Riemann-Stieltjes integrable with respect to $g$, then $\int_a^b f\,dg \ge 0$.


C.1.5 Lemma (Additivity of Riemann-Stieltjes integral) Let $c \in [a,b]$. If $f$ is
Riemann-Stieltjes integrable with respect to $g$ on the intervals $[a,c]$ and $[c,b]$ then
$f$ is also integrable with respect to $g$ on $[a,b]$ and

\[ \int_a^b f\,dg = \int_a^c f\,dg + \int_c^b f\,dg. \]

C.1.6 Lemma (Property of calibration) The function identically 1 is integrable with
respect to any function $g$ on $[a,b]$ and

\[ \int_a^b 1\,dg = g(b) - g(a). \]


The Riemann-Stieltjes integral has a property of reversibility outlined by the 
following theorem. 


C.1.7 Theorem (Integration by parts) If $\int_a^b f\,dg$ exists, then $\int_a^b g\,df$ also exists and

\[ \int_a^b f\,dg = f(b)g(b) - f(a)g(a) - \int_a^b g\,df. \]

Proof Given a division $\Delta$ consisting of the points

\[ a = x_0 < x_1 < \dots < x_n = b, \]

and a system $\xi = (\xi_k)_{k=1}^{n}$ of intermediate $\Delta$-points such that $\xi_k \in [x_{k-1}, x_k]$ for
each $k$, we have

\[
\begin{aligned}
S_{(\Delta,\xi)}(g; f) &= \sum_{k=1}^{n} g(\xi_k)(f(x_k) - f(x_{k-1})) \\
&= \sum_{k=1}^{n} g(\xi_k) f(x_k) - \sum_{k=0}^{n-1} g(\xi_{k+1}) f(x_k) \\
&= f(x_n) g(\xi_n) - \sum_{k=1}^{n-1} f(x_k)(g(\xi_{k+1}) - g(\xi_k)) - f(x_0) g(\xi_1) \\
&= f(b)g(b) - f(a)g(a) - \sum_{k=0}^{n} f(x_k)(g(\xi_{k+1}) - g(\xi_k)) \\
&= f(b)g(b) - f(a)g(a) - S_{(\tilde{\xi},\Delta)}(f; g),
\end{aligned}
\]

where $\xi_0 = a$, $\xi_{n+1} = b$ and $\tilde{\xi}$ is the division of $[a,b]$ obtained from $\xi$ by adding the
end-points $a$ and $b$ (and counting repeated points only once); $\Delta$ is the intermediate
$\tilde{\xi}$-point system generated by the points of $\Delta$.

Since $\|\tilde{\xi}\| \le 2\|\Delta\|$ and the integral $\int_a^b f\,dg$ exists, the integral sums $S_{(\Delta,\xi)}(g; f)$
have a limit as $\|\Delta\| \to 0$ and this fact implies both the existence of the integral $\int_a^b g\,df$
and the formula in the statement.
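
The integration by parts formula remains valid when the integrator has jumps, which is easy to observe numerically. In the sketch below, $f(x) = x$ and $g$ is the integer part function on $[0, 3]$; the helper `stieltjes`, the uniform division and the midpoint tags are assumptions chosen for the illustration (the existence of $\int_a^b f\,dg$ for a continuous $f$ and a step function $g$ is covered by Theorem C.1.8).

```python
import math

def stieltjes(f, g, a, b, n=3000):
    """Midpoint-tagged Riemann-Stieltjes sum on a uniform division of [a, b]."""
    xs = [a + (b - a) * k / n for k in range(n + 1)]
    return sum(f((p + q) / 2) * (g(q) - g(p)) for p, q in zip(xs, xs[1:]))

def f(x):
    return x                    # continuous integrand

g = math.floor                  # step function of bounded variation
a, b = 0.0, 3.0

int_f_dg = stieltjes(f, g, a, b)     # approaches f(1) + f(2) + f(3) = 6
int_g_df = stieltjes(g, f, a, b)     # approaches the integral of floor, 3
boundary = f(b) * g(b) - f(a) * g(a)
print(int_f_dg + int_g_df, boundary)
```

Only the three subintervals that contain a jump of $g$ contribute to the first sum, each with a weight equal to the jump; this is the mechanism behind Exercise 3 of this section.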


A change of variable formula is the subject of Exercise 7.

The most important case of integrability is that of continuous functions with
respect to functions of bounded variation.


C.1.8 Theorem Every continuous function $f : [a,b] \to \mathbb{R}$ is Riemann-Stieltjes
integrable with respect to every function $g : [a,b] \to \mathbb{R}$ which has bounded variation.

Proof Due to the property of linearity of the integral (see Lemma C.1.3(b)), it suffices
to consider the case where $g$ is increasing. The rest of the proof is done by adapting
the Darboux Criterion of Riemann Integrability.

Let $\Delta = (x_k)_{k=0}^{n}$ be a division of $[a,b]$. We attach to $\Delta$ the bounds

\[ m_k(\Delta) = \min_{x \in [x_k, x_{k+1}]} f(x) \quad \text{and} \quad M_k(\Delta) = \max_{x \in [x_k, x_{k+1}]} f(x) \]

and define the lower Darboux-Stieltjes sum and the upper Darboux-Stieltjes sum
respectively by the formulas

\[ LDS_{\Delta}(f; g) = \sum_{k=0}^{n-1} m_k(\Delta)(g(x_{k+1}) - g(x_k)) \]

and

\[ UDS_{\Delta}(f; g) = \sum_{k=0}^{n-1} M_k(\Delta)(g(x_{k+1}) - g(x_k)). \]

If $\Delta_2$ is finer than $\Delta_1$, then

\[ LDS_{\Delta_1}(f; g) \le LDS_{\Delta_2}(f; g) \le UDS_{\Delta_2}(f; g) \le UDS_{\Delta_1}(f; g) \]

and thus, for every two divisions $\Delta_1$ and $\Delta_2$,

\[ LDS_{\Delta_1}(f; g) \le UDS_{\Delta_2}(f; g), \]

that is, every lower Darboux sum does not exceed any upper Darboux sum.
Since

\[ LDS_{\Delta}(f; g) \le S_{(\Delta,\xi)}(f; g) \le UDS_{\Delta}(f; g) \]

for every intermediate $\Delta$-point system $\xi$, the above discussion leads to the conclusion
that the integrability of $f$ with respect to $g$ can be derived via the following
easy-to-check condition: for every $\varepsilon > 0$, there is $\delta > 0$ such that

\[ UDS_{\Delta}(f; g) - LDS_{\Delta}(f; g) < \varepsilon \]

for all divisions $\Delta$ with $\|\Delta\| < \delta$. Due to the property of a continuous function to be
bounded on any compact interval and to attain its bounds (Theorem 6.5.1), we have

\[
\begin{aligned}
0 \le UDS_{\Delta}(f; g) - LDS_{\Delta}(f; g) &= \sum_{k=0}^{n-1} (M_k(\Delta) - m_k(\Delta))(g(x_{k+1}) - g(x_k)) \\
&\le \omega_f(\|\Delta\|)(g(b) - g(a)),
\end{aligned}
\]

and the proof ends by taking into account the property of uniform continuity of $f$.
See Theorem 6.5.5.


C.1.9 Remark (Integrability of complex-valued functions) A complex-valued function
$f$ is Riemann-Stieltjes integrable with respect to an increasing function $g$ if and
only if both $\operatorname{Re} f$ and $\operatorname{Im} f$ have this property. Moreover,

\[ \int_a^b f(x)\,dg = \int_a^b \operatorname{Re} f(x)\,dg + i \int_a^b \operatorname{Im} f(x)\,dg. \]

This follows easily from the fact that

\[ \operatorname{Re} S_{(\Delta,\xi)}(f; g) = S_{(\Delta,\xi)}(\operatorname{Re} f; g) \quad \text{and} \quad \operatorname{Im} S_{(\Delta,\xi)}(f; g) = S_{(\Delta,\xi)}(\operatorname{Im} f; g). \]

As a consequence, the integrability of complex-valued functions can be reduced to
the integrability of real-valued functions. Moreover,

\[ \left| \int_a^b f(x)\,dg \right| \le \sup_{x \in [a,b]} |f(x)| \cdot V_a^b(g). \]


C.1.10 Remark (Improper Riemann-Stieltjes integrals) Following the model of
Riemann improper integrals, one can consider improper Riemann-Stieltjes integrals.
For example, if $f : [a, \infty) \to \mathbb{C}$ is a continuous function and $g : [a, \infty) \to \mathbb{R}$
is a function having bounded variation on each compact subinterval of $[a, \infty)$, we
define

\[ \int_a^{\infty} f(x)\,dg = \lim_{b \to \infty} \int_a^b f(x)\,dg, \]

provided that the limit exists.


Exercises 


1. Compute the integrals: iia sin x d |x| and hie x3di/|x]. 


ny —s—] — _1i-: nl i 
2. ae that di x *~'d[x] =—;n Dose for every s > 0 and every integer 
$n \ge 1$.
3. Suppose that $f : [a,b] \to \mathbb{R}$ is a continuous function, $(x_k)_{k=1}^{n}$ is a family
of distinct points in $(a,b)$ and $(c_k)_{k=1}^{n}$ is a family of positive numbers. Prove
that $f$ is Riemann-Stieltjes integrable with respect to the increasing function
$g(x) = \sum_{k=1}^{n} c_k H(x - x_k)$ and

\[ \int_a^b f\,dg = \sum_{k=1}^{n} c_k f(x_k). \]

Here $H(x)$ denotes the Heaviside function.
4. Consider the functions 


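\[ f(x) = \begin{cases} 0 & \text{if } x \in [0, 1/2) \\ 1 & \text{if } x \in [1/2, 1] \end{cases} \quad \text{and} \quad g(x) = \begin{cases} 1 & \text{if } x \in [0, 1/2) \\ 0 & \text{if } x \in [1/2, 1]. \end{cases} \]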


Prove that f is not Riemann-Stieltjes integrable with respect to g. Extend the 
conclusion for any pair of functions having a point of discontinuity in common. 

5. Prove the properties of linearity, positivity and additivity of the Riemann-Stieltjes 
integral. 

6. (The extension of Lebesgue’s Criterion of Riemann integrability). Suppose that
$f : [a,b] \to \mathbb{R}$ is a bounded function that is continuous almost everywhere. Prove
that $f$ is Riemann-Stieltjes integrable with respect to every Lipschitz function
$g : [a,b] \to \mathbb{R}$.

7. (Formula of change of variable). Suppose that $f$ is Riemann-Stieltjes integrable
with respect to $g$ on $[a,b]$. Let $x = x(t)$ be a continuous and increasing function
on $[c,d]$ such that $x(c) = a$ and $x(d) = b$. Then the function $F(t) = f(x(t))$ is
Riemann-Stieltjes integrable with respect to $G(t) = g(x(t))$ on $[c,d]$ and

\[ \int_c^d F(t)\,dG(t) = \int_a^b f(x)\,dg(x). \]


8. (An extension of Cauchy-Buniakovski-Schwarz Inequality). Suppose that
$\alpha : [a,b] \to \mathbb{R}$ is an increasing function and $f$, $g$ are two real-valued continuous
functions on $[a,b]$. Prove that

$$\left| \int_a^b f(x) g(x)\,d\alpha(x) \right| \le \left( \int_a^b (f(x))^2\,d\alpha(x) \right)^{1/2} \left( \int_a^b (g(x))^2\,d\alpha(x) \right)^{1/2}.$$

What is the analogue of this inequality in the case of functions defined on intervals
of the form $[a,\infty)$?

9. Following the model of Exercise 8, formulate and prove the extension of 
Chebyshev’s Algebraic Inequality to the context of Riemann-Stieltjes integral. 
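
The formula of Exercise 3 can be checked numerically. The following is a minimal sketch, not part of the original text: it approximates the Riemann-Stieltjes integral by sums over a uniform division tagged at midpoints and compares the result with $\sum_k c_k f(x_k)$. The names `heaviside`, `rs_sum`, `points` and `weights`, the sample data, and the convention $H(0) = 1$ are all assumptions made for illustration.

```python
def heaviside(x):
    """Heaviside function, with the convention H(0) = 1 assumed here."""
    return 1.0 if x >= 0 else 0.0


def rs_sum(f, g, a, b, n):
    """Riemann-Stieltjes sum of f with respect to g over a uniform
    division of [a, b] into n subintervals, tagged at the midpoints."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        left, right = a + k * h, a + (k + 1) * h
        total += f(0.5 * (left + right)) * (g(right) - g(left))
    return total


# Hypothetical sample data: jump points x_k in (0, 1) and positive weights c_k.
points = [0.25, 0.5, 0.8]
weights = [2.0, 1.0, 3.0]


def g(x):
    """Increasing step function g(x) = sum of c_k * H(x - x_k)."""
    return sum(c * heaviside(x - p) for c, p in zip(weights, points))


def f(x):
    """A continuous integrand chosen for the test."""
    return x * x


approx = rs_sum(f, g, 0.0, 1.0, 20_000)
exact = sum(c * f(p) for c, p in zip(weights, points))
print(approx, exact)
```

Only the subintervals containing a jump of `g` contribute to the sum, which is why the approximation agrees with the weighted sum of the values $f(x_k)$ up to the oscillation of `f` on one short subinterval.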
C.2 Applications to Probability Theory


Given a probability space $(\Omega, \Sigma, P)$, one can attach to it a theory of integration
that mimics the construction of the Lebesgue integral. The starting point is outlined in
Exercise 3, Sect. 11.2. Full details are available in many books such as those by Chow
and Teicher [17] and Hewitt and Stromberg [10].

The analogue of real-valued measurable functions in the context of that integral
is provided by the random variables, that is, by the functions $X : \Omega \to \mathbb{R}$ with the
property that $X^{-1}(A) \in \Sigma$ for every Borel subset $A$ of $\mathbb{R}$. If a random variable $X$
is integrable with respect to $P$, its integral $E(X) = \int_\Omega X(\omega)\,dP(\omega)$ represents the
expectation (or the mean value) of $X$. This is the value one would expect to find if one
could repeat the experiment to which the random variable is attached an infinite number of
times and take the average of the values obtained.

A very convenient way to introduce the expectation of a random variable X is 
provided by the Riemann-Stieltjes integral. The key ingredient is the distribution 
function of X, which is defined by the formula 


$$F_X : \mathbb{R} \to [0,1], \qquad F_X(x) = P(\{\omega : X(\omega) < x\}).$$

This function is increasing and its limits at infinity are $\lim_{x\to-\infty} F_X(x) = 0$ and
$\lim_{x\to\infty} F_X(x) = 1$. Moreover, $\lim_{x\to x_0-} F_X(x) = F_X(x_0)$ at every $x_0 \in \mathbb{R}$.
The proof is left to the reader as Exercise 1.


The distribution function allows us to introduce the expectation of a random 
variable by a formula that avoids the use of measure theory. Precisely, 


$$E(X) = \int_{-\infty}^{\infty} x\,dF_X(x),$$


provided the right hand side integral exists. 
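
To see this formula at work, here is a small numerical sketch (an illustration added here, under stated assumptions). It approximates $\int x\,dF_X(x)$ by Riemann-Stieltjes sums for an exponentially distributed variable with rate 1, whose distribution function is $1 - e^{-x}$ for $x > 0$ and whose expectation is known to be 1. The function names, the truncation of the range to $[0, 40]$ and the choice of distribution are assumptions, not part of the text. Since this distribution function is continuous, the convention $X(\omega) < x$ versus $X(\omega) \le x$ plays no role.

```python
import math


def exp_cdf(x, rate=1.0):
    """Distribution function of an exponential variable (assumed example)."""
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0


def rs_expectation(cdf, lo, hi, n):
    """Approximate the Riemann-Stieltjes integral of x dF(x) on [lo, hi]
    by a sum over a uniform division tagged at the midpoints."""
    h = (hi - lo) / n
    total = 0.0
    for k in range(n):
        left, right = lo + k * h, lo + (k + 1) * h
        total += 0.5 * (left + right) * (cdf(right) - cdf(left))
    return total


# The tail beyond 40 contributes a negligible amount for rate 1.
mean = rs_expectation(exp_cdf, 0.0, 40.0, 40_000)
print(mean)
```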
The variance of the random variable $X$,

$$\operatorname{Var}(X) = E\left( (X - E(X))^2 \right) = E(X^2) - (E(X))^2,$$

measures how spread out its values are. A variance of zero indicates that all
the values of $X$ are identical (except possibly for a subset of $\Omega$ of probability zero).
In statistics, the expectation and the variance are denoted respectively $\mu$ and $\sigma^2$.
The square root of the variance is called the standard deviation and is denoted $\sigma$.
There are two particular types of random variables important in applications. 
A random variable $X$ is called discrete if it takes a countable set of values $x_1, x_2, x_3, \dots$,
respectively with probabilities $p_1, p_2, p_3, \dots$, and $\sum_{n=1}^\infty p_n = 1$. In this case

$$E(X) = \sum_{n=1}^\infty p_n x_n$$

and

$$\operatorname{Var}(X) = \sum_{n=1}^\infty p_n x_n^2 - \left( \sum_{n=1}^\infty p_n x_n \right)^2 = \frac{1}{2} \sum_{i=1}^\infty \sum_{j=1}^\infty p_i p_j (x_i - x_j)^2.$$
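
The two expressions for the variance of a discrete random variable can be compared on a concrete case. The sketch below (an added illustration; the fair six-sided die and the variable names are assumptions) uses exact rational arithmetic so that the identity is checked without rounding error.

```python
from fractions import Fraction

# Hypothetical example: a fair die, values 1..6 with probability 1/6 each.
values = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6

# E(X) = sum of p_n * x_n
mean = sum(p * x for p, x in zip(probs, values))

# Var(X) as the second moment minus the squared mean
var_moments = sum(p * x * x for p, x in zip(probs, values)) - mean ** 2

# Var(X) as half the double sum of p_i * p_j * (x_i - x_j)^2
var_pairs = Fraction(1, 2) * sum(
    pi * pj * (xi - xj) ** 2
    for pi, xi in zip(probs, values)
    for pj, xj in zip(probs, values)
)

print(mean, var_moments, var_pairs)  # 7/2 35/12 35/12
```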


A random variable $X$ is called continuous if its distribution function is of the form

$$F_X(x) = P(\{\omega : X(\omega) < x\}) = \int_{-\infty}^x f(t)\,dt$$

for a suitable Lebesgue integrable function $f \in \mathcal{L}^1(\mathbb{R})$, called the density of $F_X$. In
this case, the probability that $X$ takes a value $\alpha$ is 0 and

$$P(\{\omega : a \le X(\omega) < b\}) = \int_a^b f(t)\,dt \quad \text{for all } -\infty \le a < b \le \infty.$$


Moreover, the computation of the expectation and of the variance of X reduces to 
the computation of certain Lebesgue integrals: 


$$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx \quad\text{and}\quad \operatorname{Var}(X) = \int_{-\infty}^{\infty} (x - E(X))^2 f(x)\,dx.$$


A continuous random variable $X$ is called normal if its distribution function is
associated to a density of the form

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}.$$

In this case, the values of the parameters $\mu \in \mathbb{R}$ and $\sigma > 0$ are precisely the
expectation and the standard deviation of $X$, so that $\operatorname{Var}(X) = \sigma^2$. See Exercise 4.
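
The role of the two parameters of a normal density can be checked numerically before attempting a proof. The following sketch (added for illustration; the midpoint rule, the truncation to twelve standard deviations and the sample values of the parameters are assumptions) integrates $x f(x)$ and $(x - E(X))^2 f(x)$ and recovers the expectation $\mu$ and the variance $\sigma^2$.

```python
import math


def normal_density(x, mu, sigma):
    """Density of a normal variable with expectation mu and standard deviation sigma."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))


def midpoint_integral(func, lo, hi, n):
    """Composite midpoint rule on [lo, hi] with n equal subintervals."""
    h = (hi - lo) / n
    return h * sum(func(lo + (k + 0.5) * h) for k in range(n))


# Hypothetical parameter values chosen for the test.
mu, sigma = 1.5, 2.0
# The density is negligible outside mu +/- 12 sigma.
lo, hi = mu - 12 * sigma, mu + 12 * sigma

mean = midpoint_integral(lambda x: x * normal_density(x, mu, sigma), lo, hi, 20_000)
var = midpoint_integral(lambda x: (x - mean) ** 2 * normal_density(x, mu, sigma), lo, hi, 20_000)
print(mean, var)
```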
Exercises


1. Prove the following properties of the distribution functions:

(a) $\lim_{x\to-\infty} F_X(x) = 0$ and $\lim_{x\to\infty} F_X(x) = 1$;
(b) $\lim_{x\to x_0-} F_X(x) = F_X(x_0)$ for every $x_0 \in \mathbb{R}$.

2. A Poisson random variable (with parameter $\lambda > 0$) is a discrete random variable
$X$ which takes the value $k$ with the probability $p_k = \frac{\lambda^k}{k!} e^{-\lambda}$ for $k = 0, 1, 2, \dots$.
Prove that the mean value and the variance of such a random variable verify the
formulas $E(X) = \operatorname{Var}(X) = \lambda$.

3. A binomial random variable is a discrete random variable taking value $k$ with
probability $p(k) = \binom{n}{k} p^k (1-p)^{n-k}$, for $k = 0, 1, 2, \dots, n$. Here $p \in (0,1)$ is a
parameter. Prove that the mean value and the variance of such a random variable
verify the formulas $E(X) = np$ and $\operatorname{Var}(X) = np(1-p)$.

4. Consider a normal random variable $X$ whose density is $\frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$. Prove that
$E(X) = \mu$ and $\operatorname{Var}(X) = \sigma^2$.
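
The formulas stated in Exercises 2 and 3 can be sanity-checked by direct summation before proving them. The sketch below is an added illustration: the helper `mean_and_variance`, the sample parameters and the truncation of the Poisson series at 60 terms are assumptions. The binomial case is computed exactly with rational numbers; the Poisson case is only approximated in floating point.

```python
import math
from fractions import Fraction


def mean_and_variance(pmf):
    """pmf is a list of (value, probability) pairs of a discrete variable."""
    mean = sum(p * k for k, p in pmf)
    var = sum(p * k * k for k, p in pmf) - mean ** 2
    return mean, var


# Binomial variable with hypothetical parameters n = 10, p = 1/3 (exact arithmetic).
n, p = 10, Fraction(1, 3)
binomial = [(k, math.comb(n, k) * p ** k * (1 - p) ** (n - k)) for k in range(n + 1)]
b_mean, b_var = mean_and_variance(binomial)

# Poisson variable with hypothetical parameter 2.5; the series is truncated,
# the neglected tail being far below floating-point precision.
lam = 2.5
poisson = [(k, math.exp(-lam) * lam ** k / math.factorial(k)) for k in range(60)]
p_mean, p_var = mean_and_variance(poisson)

print(b_mean, b_var)   # expected n*p and n*p*(1-p)
print(p_mean, p_var)   # both expected close to lam
```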


C.3 Notes and Remarks


Stieltjes [18] introduced the integral bearing his name in 1894, in connection
with his research on continued fractions. The theory of this integral can be found
in many books such as [10] and [19]. It is worth mentioning here the classical result
of Laurence Chisholm Young [20] stating that the integral $\int_a^b f\,dg$ is well-defined if
$f$ is Lipschitz of order $\alpha > 0$ and $g$ is Lipschitz of order $\beta > 0$ with $\alpha + \beta > 1$. See
also [15].

In Sect. C.2 we outlined the role played by the Riemann-Stieltjes integral in defin- 
ing the expectation of a random variable. The list of applications can be continued. 

The contour integral in complex analysis and the line integral of a vector field
along a rectifiable curve are both special cases of Riemann-Stieltjes integrals. See
Protter and Morrey [21], Sect. 16.3.

Riesz [22] proved in 1909 the following surprising connection between the
Riemann-Stieltjes integral and the continuous linear functionals on the Banach space
$C([a,b])$.


C.3.1 Theorem (Riesz Representation Theorem) Every continuous linear functional
$\varphi$ on the Banach space $C([a,b])$ is of the form

$$\varphi(f) = \int_a^b f(x)\,d\Phi(x),$$

where $\Phi$ is a complex-valued function of bounded variation on $[a,b]$, and moreover

$$\|\varphi\| = \bigvee_a^b \Phi.$$


For details, see Bhatia [23], pp. 52-59. 
The following characterization of completely monotone functions via the
Riemann-Stieltjes integral is due to Bernstein [24]:


C.3.2 Theorem A function $f : (0,\infty) \to \mathbb{R}$ is completely monotonic if and only if
it is of the form

$$f(x) = \int_0^\infty e^{-xt}\,d\alpha(t),$$

where $\alpha(t)$ is increasing and the right hand side integral converges for every
$x \in (0,\infty)$.


For details, see Widder [19], Chap. 4, Theorem 12a.
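
A simple instance of Bernstein's representation (a worked example added here, not taken from the text) is the function $f(x) = 1/x$ on $(0,\infty)$. Its derivatives alternate in sign, so it is completely monotonic, and the increasing function $\alpha(t) = t$ represents it:

```latex
\[
(-1)^n f^{(n)}(x) = \frac{n!}{x^{n+1}} \ge 0 \quad (n = 0, 1, 2, \dots),
\qquad
\frac{1}{x} = \int_0^\infty e^{-xt}\,dt = \int_0^\infty e^{-xt}\,d\alpha(t)
\quad \text{with } \alpha(t) = t .
\]
```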
Last but not least we should mention here the Tauberian theorem of Jovan 
Karamata. 


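C.3.3 Theorem Let $\alpha : (0,\infty) \to \mathbb{R}$ be an increasing function such that the integral
$\int_0^\infty e^{-xt}\,d\alpha(t)$ converges for all $x > 0$ and

$$\lim_{x\to 0+} x^{\gamma} \int_0^\infty e^{-xt}\,d\alpha(t) = C,$$

for some strictly positive constants $\gamma$ and $C$. Then for every function $f \in C([0,1])$
we have

$$\lim_{x\to 0+} x^{\gamma} \int_0^\infty f(e^{-xt})\,e^{-xt}\,d\alpha(t) = \frac{C}{\Gamma(\gamma)} \int_0^\infty f(e^{-t})\,t^{\gamma-1} e^{-t}\,dt.$$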


For details, applications and extensions, see Bingham et al. [25] and Widder [19]. 
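
The simplest case of Karamata's theorem can be verified by hand (a check added here, assuming the usual form of the conclusion, with limits taken as $x \to 0+$). For $\alpha(t) = t$ the hypothesis holds with $\gamma = 1$ and $C = 1$, and the substitution $s = xt$ shows that both sides of the conclusion coincide for every $f \in C([0,1])$:

```latex
\[
x \int_0^\infty e^{-xt}\,dt = 1,
\qquad
x \int_0^\infty f(e^{-xt})\,e^{-xt}\,dt
  = \int_0^\infty f(e^{-s})\,e^{-s}\,ds
  = \frac{1}{\Gamma(1)} \int_0^\infty f(e^{-t})\,t^{0}\,e^{-t}\,dt
  = \int_0^1 f(u)\,du .
\]
```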
Symbols


the classical numerical sets (naturals, integers, rationals, reals and complex)
the set of positive integers
the set of nonnegative real numbers
the set of positive real numbers
the set of extended real numbers
Euclidean n-space
the power set of A
empty set
the σ-algebra of Borel sets
the σ-algebra of all Lebesgue measurable sets
Euler's constant
closure of A
interior of A
open ball center a, radius r
closed ball center a, radius r
the set of all neighborhoods of a
$|A|$, $\#(A)$ : cardinality of the finite set A
$\operatorname{diam}(A)$ : diameter of A
$\ell(A)$ : length of the interval A
$\chi_A$ : characteristic function of A
$\lceil x \rceil$ : ceiling function
$\lfloor x \rfloor$ : integer part (floor function)
$\{x\}$ : fractional part
$\operatorname{id}$ : identity
conjugate of f
Fourier transform of f
image of A
inverse image of A
the graph of f
$\operatorname{Im} f$ : range (image) of f
$f|_K$ : restriction of f to K
$B$ : the beta function
$\Gamma$ : the gamma function
$\operatorname{Re} z$ : real part
$\operatorname{Im} z$ : imaginary part
$\bar{z}$ : conjugate of z
$\operatorname{supp} f$ : support of f
$B_n(x)$ : Bernoulli polynomial
$B_n(f; x)$ : Bernstein polynomial
$\omega_f([u,v])$ : oscillation of f on [u, v]
$\omega_f(a)$ : oscillation of f at a
$\liminf f(x)$ : lower limit
$\limsup f(x)$ : upper limit
$\underline{D}f(a)$ and $\overline{D}f(a)$ : lower and upper derivatives
$\underline{D}^2 f(a)$ and $\overline{D}^2 f(a)$ : lower and upper second symmetric derivatives
$f'_l(a)$ and $f'_r(a)$ : one-sided derivatives
$f'(a)$ : first derivative
$f''(a)$ : second derivative
$f^{(n)}(a)$ : the nth derivative
$C(K, \mathbb{R})$, $C(K)$ : space of continuous functions on K
$C_b(T, \mathbb{R})$, $C_b(T)$ : space of continuous and bounded functions on T
$C^n(I, \mathbb{R})$ : space of functions of class $C^n$
$C_c^n(I, \mathbb{R})$, $C_c^n(I)$ : space of functions of class $C^n$ with compact support
$St([a,b], \mathbb{R})$ : space of step functions
$\mathcal{R}([a,b], \mathbb{R})$ : space of regular functions
$R([a,b], \mathbb{R})$ : space of Riemann integrable functions
$\mathcal{M}(\mathbb{R}, \mathbb{R})$, $\mathcal{M}(\mathbb{R})$ : space of measurable functions
$\mathcal{L}^p(A, \mathbb{R})$, $\mathcal{L}^p(A)$ : space of $p$th-power Lebesgue integrable functions
$\|\cdot\|_{L^p}$ : $L^p$-seminorm
$BV([a,b])$ : space of functions with bounded variation
$\bigvee_a^b f$ : variation of f on [a, b]
$\operatorname{Lip}^{\alpha}([a,b])$ : space of Lipschitz functions of order α
$\|\cdot\|_{\operatorname{Lip}}$ : Lipschitz constant
$AC(I)$ : space of absolutely continuous functions
$W^{1,1}(I)$ : Sobolev space on I
$c_0(\mathbb{N}, \mathbb{R})$, $c_0(\mathbb{N})$ : space of sequences convergent to 0
$c(\mathbb{N}, \mathbb{R})$, $c(\mathbb{N})$ : space of convergent sequences
$\ell^1(\mathbb{N}, \mathbb{R})$, $\ell^1(\mathbb{N})$ : space of absolutely summable sequences
$\ell^2(\mathbb{N}, \mathbb{R})$, $\ell^2(\mathbb{N})$ : space of square-summable sequences
$\ell^\infty(\mathbb{N}, \mathbb{R})$, $\ell^\infty(\mathbb{N})$ : space of bounded sequences
end of a proof